PREFACE


Data  used  in  this  study  was  collected  as  Task  024  under  Contract 
F41689-86-D0052  by  Universal  Energy  Systems,  Inc.,  for  the  Technical 
Training  Division  of  the  Armstrong  Laboratory’s  Human  Resources 
Directorate. 


A  COMPARISON  OF  DOMAIN  SAMPLING  PROCEDURES 
FOR  TEST  CONSTRUCTION 

CHAPTER  I
INTRODUCTION 

In  education,  business,  industry,  and  the  military  it  is  common  practice  to  assess 
an  individual's  current  skills  or  level  of  knowledge  of  a  subject  area  by  use  of  a  test,
typically  a  multiple  choice  test.  This  type  of  test,  often  called  an  achievement  test,  is 
held  distinct  from  a  test  aimed  at  determining  an  individual's  potential  future  skills  or 
knowledge,  often  called  an  aptitude  test.  An  achievement  test  developer  can  be
anyone:  a  personnel  director,  classroom  teacher,  researcher,  or  a  trained,  experienced
test  development  team.  The  test  itself  can  range  from  a  brief,  short-answer  test  to
assess  basic  mathematical  abilities  to  a  lengthy,  comprehensive  licensing  examination.
However,  the  tests  are  developed  to  reach  a  common  general  goal:  to  assess
an  individual's  current  skills  and/or  knowledge  in  a  given  subject  domain.

The  specific  goal  of  the  test  brings  to  the  fore  another  distinction  in  the 
classification  of  tests.  Tests  can  be  classified  either  as  norm-referenced  or  criterion-
referenced,  although  these  categories  are  not  mutually  exclusive.  The  1985  Standards
for  Educational  and  Psychological  Testing  (American  Educational  Research
Association,  American  Psychological  Association,  and  National  Council  on  Measurement
in  Education)  defined  a  norm-referenced  test  as  "an  instrument  for  which
interpretation  is  based  on  the  comparison  of  a  test  taker's  performance  to  the
performance  of  other  people  in  a  specified  group"  (p.92).  Popham  (1978)  stated  that  a
norm-referenced  test  is  designed  to  "ascertain  an  examinee's  status  in  relation  to  the
performance  of  a  group  of  other  examinees  who  have  completed  the  test"  (p.24).
Messick  (1989)  referred  to  norm-referenced  score  interpretation  that  "indicates  where 
the  examinee  stands  relative  to  other  people  who  took  the  test"  (p.44).  Nitko  (1984) 
referred  to  norm-referencing  scores  as  "those  that  convey  to  the  knowledgeable  test 
interpreter  information  about  an  examinee's  standing  relative  to  others  in  a  defined 
group"  (p.8).  These  definitions  emphasize  that  norm-referenced  test  scores  are  used
to  infer  relative  ability  or  achievement  rather  than  a  degree  or  absolute  level  of 
achievement  or  ability  in  a  domain. 

The  concept  of  a  criterion-referenced  test  is  somewhat  abstract  and  is  still  in  the 
process  of  formulation  by  workers  in  the  field  (Nitko,  1984).  Popham  (1978)  defined  a 
criterion-referenced  test  as  one  "used  to  ascertain  an  individual's  status  with  respect  to 
a  well-defined  behavioral  domain"  (p.93).  Some  authors  make  the  distinction  between 
a  criterion-referenced  test,  a  domain-referenced  test,  an  objective-referenced  test,  and 
a  mastery  test  (Nitko,  1984).  When  this  distinction  is  made,  the  term  "criterion-
referenced  test"  often  implies  a  test  with  an  associated  cut-off  or  passing  score  that
represents  mastery/nonmastery  status.  The  term  "domain-referenced  test"  often  refers
to  the  ability  of  a  test  score  to  describe  an  examinee's  status  on  a  well-defined  domain
of  behaviors,  with  no  cut-off  score  implied.  An  objective-referenced  test  is  a  test  with
each  item  corresponding  to  a  behavioral  objective.  A  mastery  test  is  defined  as  any
test  used  to  provide  information  about  whether  or  not  a  pupil  has  mastered  a  given 
instructional  goal.  Mastery  is  usually  conceived  "as  'knowing  more  of  a  domain" 
(Nitko,  1984,  p.23). 

The  1985  Standards  for  Educational  and  Psychological  Testing  (American  Educational
Research  Association,  et  al.)  defined  a  criterion-referenced  test  as  one  that
"allows  its  users  to  make  score  interpretations  in  relation  to  a  functional  performance
level,  as  distinguished  from  those  interpretations  that  are  made  in  relation  to  the 
performance  of  others"  (p.90).  The  1985  Standards  defined  a  domain-referenced  test 
as  one  that  "allows  users  to  estimate  the  amount  of  a  specified  content  domain  that  an 
individual  has  learned"  (p.91).  The  two  definitions  are  cross-referenced,  indicating  the 
overlap  between  them.  Messick  (1989)  made  the  distinction  between  a  criterion- 
referenced  interpretation,  that  "treats  the  score  as  a  sign  that  the  respondent  can  or 
cannot  be  expected  to  satisfy  some  performance  requirement  in  a  situation  unlike  the 
test"  (p.44)  and  a  domain-referenced  interpretation  that  "treats  the  score  as  a  domain 
sample  indicating  what  level  of  difficulty  the  person  can  cope  with  on  tasks  like  those 
in  the  test"  (Cronbach,  1984,  p.44). 

Gronlund  (1976)  noted  that  the  terms  domain-referenced,  criterion-referenced, 
objective-referenced  and  universe-referenced  have  been  used  by  some  authors  with 
somewhat  the  same  meaning.  Nitko  (1984)  noted  that  the  term  "domain-referencing" 
might  be  preferable  to  "criterion-referencing"  as  the  commonly-used,  preferred  term 
but  that  testing  specialists  have  decided  that  "criterion-referencing"  should  remain  the 
preferred  term  for  a  variety  of  reasons. 

For  the  purposes  of  this  paper,  the  broad  definition  of  a  criterion-referenced  test 
as  presented  by  Glaser  and  Nitko  will  be  used.  This  definition  states  that  a  criterion- 
referenced  test  "is  one  that  is  deliberately  constructed  to  yield  measurements  that  are 
directly  interpretable  in  terms  of  specified  performance  standards"  (1971,  p.653).  The 
performance  standards  are  specified  by  defining  a  class  or  domain  of  tasks  that  the 
individual  should  be  able  to  perform.  From  this  domain  of  tasks,  measurements  are 
taken  on  "representative  samples  of  tasks  drawn  from  the  domain"  (p.653).  Criterion- 
referenced  tests  "are  specifically  constructed  to  support  generalizations  about  an 
individual’s  performance  relative  to  a  specified  domain  of  tasks"  (p.653).  Using  this 
definition,  a  criterion-referenced  test  can  be  used  to  make  a  mastery  decision;  however,
it  is  not  assumed  that  the  assignment  of  mastery/nonmastery  status  is  the  goal  of
the  test.  A  treatment  of  the  issues  related  to  setting  of  cut-off  scores  is  beyond  the 
scope  of  this  paper. 

As  previously  mentioned,  the  categories  of  norm-referenced  tests  and  criterion- 
referenced  tests  are  not  mutually  exclusive.  Nitko  (1984)  noted  that  a  test  can  provide 
both  norm-referencing  and  criterion-referencing  information.  He  stated  that  "norm- 
referenced  data  are  needed  to  interpret  fully  an  examinee's  criterion-referenced  test 
performance"  and  that  "criterion-referencing  and  norm-referencing  provide  complementary
information"  (p.25).  Millman  and  Greene  (1989)  noted  that  when  both
interpretations  are  desired,  the  specifications  of  test  content  should  "clearly  delineate
the  bases  of  both  sets  of  inferences"  (p.342).  Also,  Messick  (1989)  warned  that  when 
two  or  more  scoring  measurement  models  are  combined,  confusion  can  result  about 
what  construct  theory  to  reference  and  what  kinds  of  construct  validity  evidence  should 
be  investigated. 

Criterion-referenced  tests  have  received  much  attention  in  the  testing  arena  in  the
last  several  years.  Popham  (1978)  noted  that  the  expression  "criterion-referenced
measurement"  was  first  used  in  1962  (Glaser  &  Klaus,  1962).  Recent  emphasis  on 
accountability  in  testing,  formative  evaluation,  computer-assisted  instruction,  and 
individualized  instruction  has  resulted  in  widespread  interest  in  criterion-referenced 
tests  that  can  be  used  to  make  instructional  and  program  decisions  (Mehrens  & 
Lehmann,  1980).  However,  the  issue  of  the  interpretation  of  a  test  score  has  been  of 
interest  for  a  great  many  years.  Popham  (1978)  noted  that  E.  L.  Thorndike,  in  1913, 
raised  the  issue  of  an  absolute  versus  a  relative  interpretation  of  test  scores, 
suggesting  that  while  a  teacher  giving  marks  for  "some  obscure  standards  of  absolute 
achievement"  may  know  what  those  marks  represent  in  terms  of  achievement,  the 
student  and  others  can  only  interpret  them  in  terms  of  standing  relative  to  other 
students.  The  goal  of  those  involved  in  criterion-referenced  test  development  has 
been  to  overcome  the  problem  of  test  score  interpretation,  developing  tests  that  give 
both  the  test  user  and  the  examinee  meaningful  information  about  what  the 
examinee's  performance  on  the  test  actually  reflects  (Popham,  1978). 

There  are  many  decisions  that  a  test  developer  must  make  in  constructing  a 
criterion-referenced  test  and  many  constraints  to  the  options  available.  Among  other 
things,  the  test  developer  must  determine  what  the  test  is  to  measure,  what  the  test 
scores  are  to  be  used  for,  the  level  of  detail  to  be  tested,  the  format  of  the  test,  and  the 
length  of  the  test.  Additionally,  the  issues  of  test  reliability  and  validity  must  be 
addressed  if  one  is  to  have  any  confidence  in  the  usefulness  of  the  test  results. 

Typically,  the  general  content  domain  of  the  test,  the  level  of  detail  of  the  test
items,  and  the  purpose  for  which  the  test  scores  are  to  be  used  are  specified  at  the 
outset.  The  item  format  chosen  is  often  a  function  of  both  what  is  to  be  measured  and 
the  objectivity  and  ease  of  scoring  required  by  the  situation.  In  skills/knowledge  tests
the  multiple  choice  format  is  commonly  chosen  due  to  its  objectivity  and  speed  and 
ease  of  scoring.  However,  content  of  the  test  items  is  a  matter  that  is  often  left  to  the 
judgment  of  the  test  developer. 

In  dealing  with  issues  of  test  content  selection  it  is  useful  to  have  a  set  of 
categories,  or  definitions,  in  mind.  In  his  discussion  of  work  sample  test  development 
and  content  validity,  Guion  (1979)  referred  to  the  set  of  all  possible  behaviors  relevant
to  the  measurement  goal  (job  performance)  as  the  job  content  universe.  That  portion 
of  the  job  content  universe  identified  for  testing  was  labelled  the  job  content  domain. 
The  set  of  all  possible  test  items  that  can  be  developed  for  the  job  content  domain  was 
referred  to  as  the  test  content  universe.  Finally,  the  sample  of  items  taken  from  the  test 
content  universe  to  make  up  the  test  was  called  the  test  content  domain.  Test  content
selection  in  this  framework  is  seen  as  a  process  of  successive  sampling. 

The  1985  Standards  defined  "content  domain"  as  "a  body  of  knowledge,  skills, 
and  abilities  defined  so  that  items  of  knowledge  or  particular  tasks  can  be  clearly 
identified  as  included  or  excluded  from  the  domain"  (p.90).  Domain  sampling  was 
defined  as  the  "process  of  selecting  test  items  to  represent  the  specific  universe  of 
performance  in  which  a  test  developer  is  interested"  (p.91).

Hambleton  (1984)  noted  that  in  the  typical  criterion-referenced  testing  situation,
a  real  or  hypothesized  domain  or  population  of  test  items  is  available.  He  defined 
domain  score  as  "the  expected  or  true  proportion  of  items  that  an  examinee  can 
answer  correctly  from  the  whole  domain  or  population  of  items"  (p.145). 

For  the  purposes  of  this  paper  a  definitional  scheme  is  used  that  is  similar  to 
Guion's  and  consonant  with  the  1985  Standards.  Figure  1  illustrates  this  definitional
scheme.  The  term  content  domain  is  used  to  refer  to  the  body  of  knowledge,  skills,
and/or  abilities  identified  as  the  target  of  measurement.  The  set  of  all  possible  items
that  could  be  developed  for  the  content  domain  is  referred  to  as  the  test  content 
universe.  The  test  content  sample  will  be  defined  as  the  sample  of  items  selected  from 
the  test  content  universe  to  make  up  one  form  of  the  test.  In  practice,  because  the  test 
content  universe  is  rarely  defined,  the  content  domain  is  often  directly  sampled  and 
test  items  developed  based  on  that  sample.  In  the  literature  on  test  construction  and 
interpretation,  many  authors  make  no  distinction  between  sampling  the  content 
domain  and  sampling  the  test  content  universe.  Thus,  when  reference  is  made  to 
domain  sampling  it  is  assumed  that  domain  sampling  also  includes  the  associated 
sampling  of  items  from  the  test  content  universe.  The  term  domain  score  refers  to  the 
expected  or  true  percentage  of  items  from  the  test  content  universe  that  an  examinee 
can  answer  correctly. 
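
As a rough illustration of the domain score defined above, the sketch below estimates it from a simple random sample of items. The function name, the 0/1 response coding, and the simulated universe are assumptions introduced here, not part of the study; the standard error uses the ordinary binomial formula and ignores any finite-population correction.

```python
import math
import random


def estimate_domain_score(responses):
    """Return (proportion correct, binomial standard error) for 0/1 responses.

    `responses` holds one entry per sampled item: 1 if the examinee answered
    correctly, 0 otherwise. The proportion correct is the usual estimate of
    the domain score when items are a random sample of the universe.
    """
    n = len(responses)
    if n == 0:
        raise ValueError("need at least one item response")
    p_hat = sum(responses) / n
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, se


# Hypothetical universe of 1,000 items for one examinee whose true domain
# score is 0.70 (1 = would answer correctly, 0 = would not).
universe = [1] * 700 + [0] * 300
rng = random.Random(0)
sample = rng.sample(universe, 40)  # a 40-item test content sample
p_hat, se = estimate_domain_score(sample)
print(f"estimated domain score: {p_hat:.2f} (SE {se:.2f})")
```

Longer samples shrink the standard error, which is one way to see why test length constrains how well a test score can stand in for the domain score.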

In  the  development  of  a  test  it  is  rarely  possible  to  construct  and  administer 
items  that  completely  exhaust  the  content  domain.  Time  and  expense  considerations 
constrain  what  can  be  covered  in  any  given  testing  situation.  Thus,  unless  the  content
domain  is  very  narrowly  defined,  it  is  necessary  to  rely  on  samples  of  test  items  from 
the  test  content  universe  to  estimate  an  individual's  domain  score.  The  quality  of  the 
generalizations  or  inferences  made  from  resultant  test  scores  is  directly  related  to  the 
quality  of  the  content  domain  definition  and  the  quality  of  the  sampling  of  the  test 
content  universe.  To  make  valid  generalizations  to  the  content  domain,  that  domain 
must  be  well-defined  and  the  item  sample  must  be  relevant  to  and  representative  of  it. 
This  requirement  to  represent  the  content  domain  also  extends  to  the  selection  of
types  of  items,  item  quality,  and  the  administration  and  scoring  procedures  used.  The 
critical  question  becomes:  To  what  extent  is  a  person's  observed  score  on  this  test 
likely  to  reflect  his/her  standing  on  the  content  domain?  This  is  a  question  of  test 
validity. 

[Figure  1  placeholder.  Surviving  diagram  label:  "percent  correct,  for  items  in  Test  Content  Sample".]

Figure  1.  The  Definitional  Framework  for  This  Study.


Validity  concerns  how  well  a  test  measures  what  it  purports  to  measure 
(Anastasi,  1982;  Allen  &  Yen,  1979).  Thus,  validity  refers  to  the  accuracy  of  predictions
or  inferences  made  from  test  scores  (Cronbach,  1971).  Validity  must  be  established
taking  into  consideration  the  particular  use  of  the  test  (Anastasi,  1982). 

Quality  tests  are  constructed  with  validity  in  mind.  The  test  developer  aims  to 
develop  a  test  that  measures  the  characteristic  he/she  has  set  out  to  measure,  whether 
it  is  a  trait,  aptitude,  or  achievement. 

The  first  step  in  test  development  is  the  specification  of  what  is  to  be  measured. 
The  content  domain  identifies  and  defines  the  target  of  measurement.  The  test  content
universe  can  then  be  specified,  theoretically,  as  all  possible  good  quality  test  items
that  can  be  developed  for  the  content  domain.  Obviously,  it  is  rarely  practical  or
possible  to  specify  the  entire  test  content  universe.  Test  specifications  typically  consist 
of  a  content  outline  that  specifies  the  proportion  of  the  items  from  each  content  area  in 
the  outline.  A  sample  of  items  is  selected  or  constructed  in  accordance  with  the  test 
specifications. 

It  is  in  this  process  of  content  specification  and  item  sampling  that  the  content
validity  of  the  measure  is  ultimately  determined.  While  this  construction  process  is  the 
focus  of  content  validity  evaluation,  it  also  has  direct  impact  on  the  criterion-related 
and  construct  validity  of  the  test.  Misspecification  of  any  of  the  areas  of  interest,  from 
the  content  domain  to  the  test  content  universe,  or  an  inappropriate  test  content 
sampling  procedure,  will  result  in  the  measurement  of  something  other  than  what  was 
intended. 

The  need  for  research  dealing  with  the  particulars  of  test  specification  has  been 
recognized.  Berk  (1984a)  stated  that  such  research  is  "sadly,  ...  almost  totally 
nonexistent"  (p.32).  Referring  to  a  comprehensive  review  of  research  on  criterion-
referenced  testing  by  Hambleton,  Swaminathan,  Algina,  and  Coulson  (1978),  Berk
noted  that  only  the  work  of  Ebel  (1962)  and  Hively,  Patterson,  and  Page  (1968) 
discussed  the  topic  of  test  specifications,  and  neither  empirically  investigated  the 
efficacy  of  various  forms  of  test  specifications. 

The  current  interest  in  and  growing  dependence  on  criterion-referenced  tests  to 
make  meaningful  instruction,  selection,  classification,  certification,  and  program  evaluation
decisions  make  it  critical  that  test  developers  have  information  to  help  in  making
content  selection  decisions.  While  expert  judgment  about  a  test's  content  representativeness
has  served  in  the  past  to  answer  challenges  to  test-based  decisions,
empirical  information  is  needed  to  justify  test  content  decisions.  This  research 
addressed  this  need  by  evaluating  the  effects  of  different  content  selection  strategies 
on  tests  covering  a  specified  content  domain.  Reliability  and  validity  of  tests 
developed  through  different  content  selection  strategies  were  evaluated  and 
compared.  Also,  because  test  development  time  and  testing  time  often  constrain  the 
number  of  items  that  can  be  developed  and  administered  (thus  constraining  domain
coverage),  the  effect  of  test  length  was  also  considered.

CHAPTER  II


RESEARCH  LITERATURE  REVIEW

Test  Construction  and  Use

The  primary  uses  of  criterion-referenced  tests  are  for  educational  and  occupational
decision  making.  These  tests  are  frequently  used  to  determine  if  an  individual
has  attained  the  skills  and  knowledges  that  are  the  goal  of  the  educational  process  or 
if  the  individual  has  the  skills  and  knowledges  requisite  for  a  given  job.  Such  tests  are 
often  constructed  by  a  teacher,  trainer,  or  personnel  specialist  who  must  decide  what 
exactly  to  include  in  the  test  and  how. 

While  the  planned  use  of  the  test  scores  determines  whether  it  will  be  criterion- 
referenced  or  norm-referenced,  there  is  little  difference  in  the  test  construction  tactics 
used.  The  selection  of  item  type  (e.g.,  essay  vs.  multiple-choice),  item  construction 
rules,  and  administration  procedures  do  not  differ  to  any  real  degree  in  criterion- 
referenced  and  norm-referenced  tests  (Popham,  1978).  Tinkelman  (1971)  and 
Millman  &  Greene  (1989)  covered  in  detail  the  steps  taken  in  planning  an  objective 
test.  Green  (1981)  presented  an  overview  of  test  construction,  administration  and  use, 
placing  his  discussion  in  the  context  of  multiple-choice  group  testing  of  cognitive 
ability.  Guidelines  for  the  construction  of  tests  also  can  be  found  in  Gronlund  (1968  & 
1976),  Shields  (1965)  and  Popham  (1978).  Roid  and  Haladyna  (1982)  gave  in-depth 
coverage  to  test  item  writing.  Extensive  treatment  of  tests  and  measurement  has  been 
given  by  Anastasi  (1982).  Thorndike  (1971)  and  Linn  (1989)  have  provided  an 
encyclopedic  treatment  of  the  area  of  test  construction  and  use,  giving  in-depth
treatment  to  a  broad  range  of  issues  and  concepts  such  as  test  design,  construction, 
administration,  processing,  test  theory  and  application.  A  good  coverage  of  technical 
issues  in  the  field  of  testing  has  been  given  by  Allen  and  Yen  (1979).  Lord  and  Novick 
(1968)  and  Lord  (1980)  have  provided  advanced  treatments  of  technical  issues  in 
testing.  Specific  attention  to  criterion-referenced  testing  within  a  more  general  treatment
of  testing  was  given  by  Crocker  and  Algina  (1986).

Selection  of  Test  Content

The  real  difference  in  the  construction  of  a  criterion-referenced  test  is  in  content 
selection.  Of  course,  norm-referenced  test  content  should  be  related  to  the  content 
domain.  However,  if  overall  content  relevance  can  be  shown  and  predictive  validity 
can  be  demonstrated,  the  descriptive  quality  of  a  norm-referenced  test  content  is  not 
held  to  intense  scrutiny.  In  contrast,  a  criterion-referenced  test  is  intended  to  estimate 
the  amount  of  a  specified  content  domain  that  an  individual  has  mastered.  Thus,  the
descriptive  quality  of  the  criterion-referenced  measure  is  a  critical  issue  and  a  major 
problem  facing  criterion-referenced  test  developers.  The  descriptive  quality  of  a  test  is 
a  direct  reflection  of  the  test  content  (Popham,  1978).  Nunnally  (1972)  stated  that  the
major  source  of  error  in  most  psychological  measures  relates  to  the  sampling  of 
content. 


Several  authors  have  addressed  the  issue  of  what  Popham  refers  to  as  the 
test's  descriptive  scheme  (1978).  Popham  includes  in  the  rubric  of  "descriptive 
scheme"  anything  from  a  simple  behavioral  objective  to  an  elaborate  set  of  test 
specifications.  The  purpose  of  the  descriptive  scheme  is  to  communicate  to  the  item 
writers  what  kind  of  items  are  to  be  included  in  the  test  and  to  test  users  what  the  test  is 
measuring.  There  are  many  approaches  to  developing  a  "descriptive  scheme,"  or  test 
specifications,  for  a  test. 

Typically,  the  test  constructor  (or  test  construction  team)  has  a  good  general 
idea  of  the  content  domain  to  be  covered  by  the  test;  it  could  be  an  instructional  area,
a  job,  or  an  area  of  certification.  The  test  constructor  is  faced  with  the  task  of  defining 
the  precise  content  domain  and  determining  which  content  elements  should  be  tested, 
since  it  is  virtually  impossible  to  test  everything  in  the  domain  because  of  time 
constraints.  The  definition  of  the  content  domain  can  vary  widely  from  test  to  test.  For 
example,  the  definition  of  the  content  domain  may  be  fairly  general  and  broad,  such
as  a  listing  of  major  historical  events  covered  by  a  history  class;  it  may  be  more
specific,  as  with  well-written  educational  objectives;  or  the  definition  may  be  highly
detailed,  as  with  tests  based  on  a  detailed  job  analysis  used  to  make  employment
decisions. 

The  descriptive  scheme  also  may  include  the  type  of  behavior  the  examinee
should  exhibit  for  each  content  area.  This  often  reflects  a  taxonomy,  such  as  the
Taxonomy  of  Educational  Objectives  (Bloom,  Engelhart,  Hill,  Furst,  &  Krathwohl,  1956;
Krathwohl  &  Payne,  1971),  that  outlines  categories  of  knowledge,  intellectual  abilities
and  skills.  This  yields  an  often-used,  two-way  outline,  or  test-blueprint  chart.  Elements 
of  the  outline  are  usually  weighted  on  importance,  and  these  weights  determine  the 
relative  emphasis  (i.e.,  number  of  test  items)  the  element  receives  in  the  test  (Adkins-Wood,
1961;  Gronlund,  1968;  Kubiszyn  &  Borich,  1987).  The  weights  usually  reflect
judgments  of  the  relative  importance  of  the  elements  to  the  goals  of  instruction  or  job; 
they  are  not  a  direct  reflection  of  the  breadth  of  the  content  area  or  the  number  of 
possible  items  associated  with  the  element.  An  outline  without  weights,  in  which  each
element  has  an  equal  number  of  items,  reflects  an  underlying  equal  weighting 
scheme.  It  is  possible  that  some  tests  are  constructed  without  an  a  priori  weighting  of 
content  area,  such  as  when  more  items  are  constructed  on  content  areas  in  which  item 
construction  is  easy  or  in  an  area  favored  by  the  test  constructor  (Adkins-Wood,  1961); 
however,  this  is  not  good  test  construction  practice. 
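
To make the link between blueprint weights and item counts concrete, the following sketch turns a set of importance weights into a whole number of items per element. The function name, the element labels, and the use of largest-remainder rounding are assumptions made for illustration; the sources cited above do not prescribe a particular rounding rule.

```python
def allocate_items(weights, test_length):
    """Split `test_length` items across blueprint elements by weight.

    `weights` maps each blueprint element to a positive importance weight.
    Each element first gets the whole-number part of its proportional share;
    any items left over go to the elements with the largest fractional parts
    (largest-remainder rounding), so the counts always sum to `test_length`.
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    quotas = {k: test_length * w / total_weight for k, w in weights.items()}
    counts = {k: int(q) for k, q in quotas.items()}
    leftover = test_length - sum(counts.values())
    by_remainder = sorted(quotas, key=lambda k: quotas[k] - counts[k], reverse=True)
    for k in by_remainder[:leftover]:
        counts[k] += 1
    return counts


# Hypothetical three-element blueprint weighted 3:2:1 for a 20-item test.
blueprint = {"A": 3, "B": 2, "C": 1}
print(allocate_items(blueprint, 20))  # {'A': 10, 'B': 7, 'C': 3}
```

Passing equal weights reproduces the unweighted outline described above, in which each element receives the same number of items (up to rounding).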

The  test  outline  is  used  to  guide  test  item  development.  There  is,  theoretically, 
an  underlying  universe  of  test  items  from  which  the  sample  of  test  items  is  taken.  Item 
development  constitutes  sampling  from  the  universe;  this  sampling  is  assumed  to  be  a 
random  or  a  stratified  random  process.  The  weighted  test  outline  typically  can  be 
viewed  as  a  stratified  random  sampling  procedure,  as  can  the  construction  of  almost 
any  mental  test  (Lord  and  Novick,  1968). 
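
The stratified random sampling view of a weighted outline can be sketched as follows. This is an illustration under stated assumptions: a finite, already-written item pool stands in for the theoretical universe of items, and the pool layout, item identifiers, and function name are hypothetical.

```python
import random


def stratified_sample(pool, counts, seed=None):
    """Draw a stratified random sample of items from an item pool.

    `pool` maps each content area (stratum) to a list of item identifiers.
    `counts` maps each content area to the number of items the test outline
    calls for. Items are drawn without replacement within each stratum.
    """
    rng = random.Random(seed)
    selected = []
    for area, n_items in counts.items():
        items = pool[area]
        if n_items > len(items):
            raise ValueError(f"not enough items in content area {area!r}")
        selected.extend(rng.sample(items, n_items))
    return selected


# Hypothetical pool: two content areas with numeric item identifiers.
pool = {"A": list(range(10)), "B": list(range(100, 105))}
test_form = stratified_sample(pool, {"A": 3, "B": 2}, seed=1)
print(test_form)
```

Because every item in a stratum has the same chance of selection, repeated draws with different seeds yield alternate forms that follow the same outline.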


In  a  review  of  the  issue  of  content  representativeness,  Messick  (1989)  pointed 
out  that  the  notion  of  content  sampling  has  not  been  universally  accepted.  He  noted
Loevinger's  (1965)  questioning  of  the  notion  of  sampling  when  no  actual  universe  of
items  or  testing  situations  exists  and  when  items  are  constructed,  not  sampled.  This 
argument  was  countered  by  Cronbach's  (1971)  assertion  that  the  important  requirement
is  that  the  boundaries  of  the  universe  be  sufficiently  well  specified  to  allow  one  to
decide  whether  any  particular  item  is  included  in  the  universe.  Messick  pointed  out 
that  the  assumption  of  sampling  from  a  universe  allows  the  use  of  inferential  models  to 
make  inferences  to  a  universe  of  items  or  tasks  like  those  constructed  or  observed; 
thus,  one  can  generalize  from  sample  performance  to  universe  performance. 

Messick  suggested  that  there  is  the  trivial  sense  of  sampling  items  from  a  large 
previously  constructed  pool.  However,  sampling  from  a  large  item  pool  is  not  sampling 
from  the  test  content  universe  unless  the  item  pool  is  coterminous  with  the  universe. 
He  suggested  that  it  would  be  nontrivial  if  the  operative  properties  of  all  items  that 
could  possibly  appear  in  the  universe,  and  thus  the  test,  could  be  specified.  In  that 
case  the  adequacy  of  the  coverage  of  the  universe  could  be  appraised. 

Often  in  certification,  selection,  or  classification  tests  the  content  area  is  highly 
detailed,  and  the  issue  of  weighting  becomes  critical.  The  constructor  of  such  tests  is 
often  called  upon  to  defend  the  content  selection  scheme,  as  with  the  National 
Academy  of  Science  review  of  military  job  performance  testing  (Wigdor  &  Green, 
1986).  However,  little  information  is  given  in  the  research  literature  on  the  impact  of 
various  weighting  schemes  on  test  properties.  Adkins-Wood  (1961)  warned  that  "the 
real  or  effective  weights  of  different  components  are  not  always  what  they  appear  to 
be"  (p.36).  She  went  on  to  explain  that  variance  in  the  different  test  components 
(content  areas)  as  well  as  in  correlated  components  affects  the  contribution  of  each 
component  to  the  information  provided  by  the  total  test  in  terms  of  determining 
individual  differences.  For  example,  if  all  examinees  get  the  same  score  on  all  the 
items  in  a  test  component,  that  component  tells  you  nothing  about  individual 
differences  except  that  there  are  none  on  that  component.  This  is  a  major  issue  in 
norm-referenced  tests,  where  interest  is  usually  in  the  differences  that  exist  rather  than 
absence  of  differences.  Also,  if  responses  on  items  from  two  different  test  components 
are  correlated,  an  individual's  total  test  score  is  dependent  on  something  more  than  is 
reflected  by  the  test  outline.  This  could  result  in  unclear  test  score  interpretation. 
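
The gap between nominal and effective weights can be illustrated numerically. The sketch below uses one common way of expressing a component's effective weight, namely the covariance of the component score with the total score divided by the variance of the total score. This particular index, the function names, and the example scores are assumptions introduced here for illustration and are not taken from Adkins-Wood (1961).

```python
from statistics import mean


def covariance(x, y):
    """Sample covariance of two equal-length score lists."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)


def effective_weights(component_scores):
    """Share of total-score variance attributable to each component.

    `component_scores` maps each test component to a list of examinee scores
    (same examinee order in every list). Each share is the component's
    covariance with the total score divided by the total-score variance,
    so the shares sum to 1.
    """
    names = list(component_scores)
    n_examinees = len(component_scores[names[0]])
    totals = [sum(component_scores[k][i] for k in names) for i in range(n_examinees)]
    var_total = covariance(totals, totals)
    if var_total == 0:
        raise ValueError("total scores do not vary across examinees")
    return {k: covariance(component_scores[k], totals) / var_total for k in names}


# Hypothetical scores for four examinees on three components.
# Every examinee earns the same score on component A.
scores = {"A": [5, 5, 5, 5], "B": [1, 2, 3, 4], "C": [2, 4, 6, 8]}
for name, share in effective_weights(scores).items():
    print(name, round(share, 3))  # A 0.0, B 0.333, C 0.667
```

In this example component A contributes nothing to individual differences because all examinees received the same score on it, whatever weight the outline assigned to it.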

Glaser  and  Nitko  (1971)  stated  that  criterion-referenced  tests  are  constructed  to 
support  generalizations  about  an  individual's  performance  relative  to  a  domain  of 
instructionally  relevant  tasks.  Thus,  criterion-referenced  tests  are  appropriate  only  to 
well-defined  domains  in  which  it  is  clear  which  categories  of  performance  or  kinds  of 
tasks  are  and  are  not  potential  test  items  (Nitko,  1984).  Nitko  also  distinguished
between  ordered  and  unordered  domains.  An  ordered  domain  might  reflect  the 
varying  degrees  of  subject  matter  difficulty  or  complexity,  degrees  of  proficiency,  prerequisite
learning  or  developmental  sequences,  or  latent  trait  location  wherein  the
behavior  domain  represents  a  single  dimension  or  factor  underlying  performance.  In
contrast,  many  domains  that  are  important  representations  of  learning  outcomes  cannot
be  ordered  but  still  require  clear  definition.  Criterion-referenced  tests  vary  widely
in  the  number  of  items  they  include  and  the  breadth  of  the  content  areas  they  cover. 

The  issue  of  developing  and  evaluating  a  test's  descriptive  scheme  can  be
classified  under  the  general  term  of  test  content  theory.  In  the  context  of  this  paper, 
test  content  theory  refers  to  the  rationale,  or  theory,  about  the  content  area  which 
underlies  the  test  specifications  developed  to  guide  test  construction.  Discussions  of 
test  content  theory  and  the  investigation  of  the  test  specifications  span  the  areas  of  test 
reliability  and  validity,  as  well  as  the  reliability  of  the  test's  descriptive  scheme.  A 
review  of  the  research  literature  relevant  to  test  specifications  and  test  content,  test 
reliability,  and  test  validity  follows. 

Linn  (1980a)  noted  that  the  primary  focus  in  achievement  test  construction 
should  be  on  the  content  of  the  items.  He  suggested  that  item  generation  starts  with 
the  definition  of  the  content  domain  complete  enough  that  all  potential  items  can  be 
enumerated,  at  least  implicitly.  He  observed  that  few  examples  of  relatively  complete 
domain  specifications  can  be  found. 

The  most  common  application  of  test  content  theory  is  in  the  development  and 
use  of  a  test  outline,  or  blueprint.  Ideally,  the  elements  of  the  test  blueprint  and  the 
associated  weights  should  reflect  an  underlying  theory  of  what  constitutes  a  competent 
individual  in  a  given  domain  and  the  behaviors  such  an  individual  should  be  able  to 
demonstrate. 

Guttman's  facet  theory  (Berk,  1978)  and  Popham's  (1984)  amplified  objectives 
are  examples  of  intermediate  positions  between  complete  specification  and  the 
traditional  table  of  content  specifications.  Guttman  (1980)  suggested  that  there  have 
been  many  theories  of  test  scores,  but  not  of  content  structure  and  specification.  He 
later called for test constructors to focus on the "sharp design of content" (1980, p.93).
He stated that facet theory provides a fruitful design of content and that "proper
treatment of content can be done only in the context of theory construction" (1980,
p.94). Guttman made the distinction between a taxonomy and a theory, asserting that
a taxonomy refers only to the definitional part of a theory, but by itself is not a theory.
He suggested that facet theory relates two basic features of an observational system:
1) the framework for defining the content of the universe of observations and 2) the
empirical  distribution  of  the  observations  carried  out  within  this  framework  of  design. 

Using  Guttman's  approach,  proposed  in  1958  and  1969,  the  investigator 
specifies  the  facets,  or  logical  dimensions,  of  a  domain  in  terms  of  such  aspects  as 
content,  form,  and  complexity.  The  facets  are  then  systematically  crossed  in  a  factorial 
fashion,  yielding  a  Cartesian  product  representing  the  facet  design  of  the  domain. 
This provides the basis for a mapping sentence or item-generation rule for determining
the  item  universe.  This  fully  specifies  the  domain  as  well  as  the  items  or  tasks  that 
might  appear  in  the  item  universe  (Dancer,  1986).  Thus,  the  potential  item  universe  is 
specified  (Messick,  1989).  Facet  theory  has  been  applied  most  successfully  in  the 
area  of  attitude  measurement. 
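
The factorial crossing of facets can be sketched in a few lines of Python. The facet names and elements below are hypothetical placeholders chosen for illustration, not taken from Guttman or any study cited here; the sketch only shows how a Cartesian product enumerates the cells of a facet design and how a simple mapping-sentence template might be filled in for each cell.

```python
from itertools import product

# Hypothetical facets for an illustrative arithmetic domain (assumed, not from the source).
facets = {
    "content": ["addition", "subtraction"],
    "form": ["numeric", "word problem"],
    "complexity": ["one-step", "two-step"],
}

def facet_design(facets):
    """Cross all facets factorially; each cell is one combination of facet elements."""
    names = list(facets)
    return [dict(zip(names, combo)) for combo in product(*(facets[n] for n in names))]

def mapping_sentence(cell):
    """Fill a simple mapping-sentence template for one cell of the design."""
    return (f"The examinee solves a {cell['complexity']} {cell['content']} item "
            f"presented as a {cell['form']}.")

cells = facet_design(facets)                      # 2 x 2 x 2 cells
sentences = [mapping_sentence(c) for c in cells]  # one sentence per cell
```

Each cell of the design would then serve as a stratum from which items are written or sampled.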


Popham  (1984)  discussed  the  importance  of  an  unambiguous  description  of 
what  a  test  is  measuring  in  the  context  of  criterion-referenced  test  construction.  He 
made  the  point  that  without  unambiguous  specifications,  a  criterion-referenced  test 
has  no  advantage  over  norm-referenced  measures.  He  described  the  ideal  situation 
of  having  explicit  test  specifications  and  congruent  test  items  leading  to  accurate 
interpretation  of  what  an  examinee's  test  performance  means.  He  described  the 
specification strategy used by the Instructional Objectives Exchange (IOX). The
emphasis of this strategy is not on overall description of the content domain but, rather,
on  specification  of  elements  of  the  domain  and  item  writing  rules. 

Osburn (1968) discussed generalization beyond test items and the need for an
unambiguous basis for generalization. He suggested that the basis for generalization
"must be contained in the operational definition of the procedures used in generating
and sampling items that go to make up the test" (p.96). To that end, all possible items
should  be  specified  in  advance,  and  random  sampling  or  stratified  random  sampling 
from  the  universe  of  content  should  take  place.  These  are  requirements  of  a  universe- 
defined  test,  which  provides  an  unbiased  estimate  of  an  individual's  score  on  an 
explicitly  defined  universe  of  item  content. 
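
As a minimal sketch of stratified random sampling from an enumerated item universe, the following Python fragment allocates test length to strata in proportion to their size and then draws items at random within each stratum. The stratum names, item identifiers, and the proportional-allocation rule are assumptions made for illustration; they are not a procedure prescribed by Osburn.

```python
import random

def stratified_item_sample(universe, n_items, seed=0):
    """Draw a proportionally allocated stratified random sample of items.

    universe: dict mapping stratum name -> list of item identifiers.
    n_items:  total test length desired (must not exceed the universe size).
    """
    rng = random.Random(seed)
    total = sum(len(items) for items in universe.values())
    if n_items > total:
        raise ValueError("test length exceeds the size of the item universe")
    # Proportional allocation with largest-remainder rounding so sizes sum to n_items.
    quotas = {s: n_items * len(items) / total for s, items in universe.items()}
    sizes = {s: int(q) for s, q in quotas.items()}
    leftover = n_items - sum(sizes.values())
    by_remainder = sorted(quotas, key=lambda s: quotas[s] - sizes[s], reverse=True)
    for s in by_remainder[:leftover]:
        sizes[s] += 1
    # Simple random sampling without replacement within each stratum.
    return {s: rng.sample(universe[s], sizes[s]) for s in universe}

# Hypothetical universe of 100 items in three content strata.
universe = {
    "fractions": [f"F{i}" for i in range(60)],
    "decimals": [f"D{i}" for i in range(30)],
    "percents": [f"P{i}" for i in range(10)],
}
sample = stratified_item_sample(universe, n_items=10)
```

Simple (unstratified) random sampling is the special case in which the whole universe is treated as a single stratum.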

Messick  (1989)  discussed  the  application  of  "universe-defined"  tests,  proposed 
by Osburn (1968) and Hively, Patterson, and Page (1968), in which the content
domain  is  analyzed  into  a  hierarchical  arrangement  of  item  forms.  Each  item  form 
contains  wording,  variable  elements,  and  rules  for  replacing  those  elements.  Messick 
noted  that  the  "direction  of  the  argument  flows  not  from  a  domain  specification  to  an 
item  sample  but  from  an  item  form  to  an  item  universe"  (p.40).  He  also  noted  that 
Guttman's  mapping-sentence  approach  is  more  applicable  to  broad  domains. 
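
To make the idea of an item form concrete, the sketch below represents one form as fixed wording, a set of variable elements, and replacement rules, and then enumerates the item universe the form generates. The wording, the replacement sets, and the answer rule are hypothetical examples, not item forms from Hively, Patterson, and Page.

```python
from itertools import product

# Hypothetical item form: fixed wording plus variable elements A and B.
item_form = {
    "wording": "What is {A} + {B}?",
    # Replacement rules: the admissible values for each variable element.
    "replacement_sets": {"A": range(10, 13), "B": range(1, 4)},
    # Rule for the keyed answer, expressed as a function of the variable elements.
    "key": lambda A, B: A + B,
}

def generate_items(form):
    """Enumerate every item the form can produce, paired with its keyed answer."""
    names = list(form["replacement_sets"])
    items = []
    for values in product(*(form["replacement_sets"][n] for n in names)):
        binding = dict(zip(names, values))
        items.append((form["wording"].format(**binding), form["key"](**binding)))
    return items

item_universe = generate_items(item_form)  # 3 x 3 = 9 items from this one form
```

The direction of the argument is visible in the code: the universe is whatever the form generates, rather than a sample drawn to match a prior domain description.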

Little  is  known  about  how  well  content  specifications  work  and  how  they  might 
be  improved  (Linn,  1980b).  A  "duplicate-experiment"  was  suggested  by  Cronbach 
(1971) to validate rigorously the fit between the operational definition of the universe
and the actual test operations. This study, earlier approximated by Ebel (1962), called
for  the  construction  of  two  versions  of  a  test  by  two  independent  test  construction 
teams  using  the  same  content  specifications  strategy.  The  adequacy  (or  reliability)  of 
the  specifications  would  be  judged  by  the  degree  of  equivalence  between  the  two 
forms. 
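
One simple way to summarize the equivalence of two independently constructed forms is to give both to the same examinees and compare the scores. The sketch below computes the correlation between forms and the difference in mean scores; the score vectors are invented for illustration, and these two summaries are only assumed stand-ins for the fuller equivalence analysis a duplicate-construction study would require.

```python
import math

def form_equivalence(scores_1, scores_2):
    """Return (Pearson correlation, mean difference) for paired scores on two forms."""
    if len(scores_1) != len(scores_2) or len(scores_1) < 2:
        raise ValueError("need paired scores for at least two examinees")
    n = len(scores_1)
    m1, m2 = sum(scores_1) / n, sum(scores_2) / n
    sxy = sum((a - m1) * (b - m2) for a, b in zip(scores_1, scores_2))
    sxx = sum((a - m1) ** 2 for a in scores_1)
    syy = sum((b - m2) ** 2 for b in scores_2)
    return sxy / math.sqrt(sxx * syy), m1 - m2

# Hypothetical scores of five examinees on forms built by two independent teams.
team_a_form = [12, 15, 9, 18, 14]
team_b_form = [11, 16, 10, 17, 13]
r, mean_diff = form_equivalence(team_a_form, team_b_form)
```

A high correlation with a large mean difference would indicate forms that rank examinees alike but differ in difficulty, so both summaries are worth inspecting.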


In  an  evaluation  of  a  Department  of  Defense  project  to  measure  military  job 
performance,  the  National  Academy  of  Sciences  Committee  on  the  Performance  of 
Military  Personnel  (Wigdor  &  Green,  1986)  suggested  the  use  of  random  sampling 
techniques  to  select  test  content,  contrasting  that  technique  to  a  judgment-based 
sampling  approach.  Each  of  the  various  approaches  used  by  the  Military  Services 
(Army,  Navy,  Air  Force,  and  Marines)  to  construct  performance  measures  for  this 
project  had  a  judgmental  component  in  the  selection  of  test  content  (Human 
Resources  Research  Organization  and  American  Institute  for  Research,  1984; 
Lammlein, 1987; Lipscomb, 1984; Maier & Hiatt, 1985).

Discussing job proficiency test development, the Committee acknowledged
practical  problems  such  as  insuring  optimal  domain  coverage  with  limited  test  time, 
hands-on  testing  of  dangerous  tasks  or  tasks  involving  expensive  equipment,  and 
"face  validity"  from  the  perspective  of  both  test  takers  and  those  using  the  results. 
While the Committee believed that an expert judgment approach to content selection addressed
those problems, it held that the representativeness of the sample should be the major concern.

The  Committee  asserted  that  a  random  sampling  approach’s  major  contribution 
is  that  it  "permits  a  known  degree  of  representativeness"  (p.50).  Random  sampling 
"allows  one  to  make,  with  known  margins  of  error,  statements  that  can  be  generalized 
to  the  entire  universe"  (p.46).  The  report  noted  that  a  judgmental  approach  to 
sampling  introduces  a  "measurement  bias  that  cannot  be  precisely  estimated"  (p.57). 
The  Committee  pointed  out  that  initial  stratification,  prior  to  content  selection,  could  be 
used  to  increase  precision.  However,  the  report  also  noted  that  sampling  a  content 
domain  was  not  as  simple  as  sampling  from  a  population  of  people;  people  already 
exist  as  separate  units  whereas  organizing  a  domain  into  separable  units  is  a  difficult 
undertaking  (Wigdor  &  Green,  1986). 
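
The "known margins of error" afforded by random sampling can be illustrated with a short calculation. Assuming a finite, enumerated universe of dichotomously scored items and simple random sampling without replacement, the sketch below estimates an examinee's domain score and an approximate 95% interval using the standard finite population correction. The function name, the example numbers, and the normal approximation are illustrative assumptions, not part of the Committee's report.

```python
import math

def domain_score_interval(n_correct, n_sampled, universe_size, z=1.96):
    """Estimate a domain score (proportion correct) with an approximate interval.

    Assumes items were drawn by simple random sampling without replacement
    from a universe of `universe_size` dichotomously scored items.
    """
    p = n_correct / n_sampled
    fpc = 1 - n_sampled / universe_size            # finite population correction
    se = math.sqrt(fpc * p * (1 - p) / (n_sampled - 1))
    return p, se, (max(0.0, p - z * se), min(1.0, p + z * se))

# Hypothetical case: 32 of 40 sampled items correct, universe of 400 items.
p_hat, se, (low, high) = domain_score_interval(32, 40, 400)
```

No comparable calculation is available for a judgmentally selected item set, which is the sense in which its bias "cannot be precisely estimated."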

Berk  (1980)  compared  six  content  domain  specification  strategies,  each 
attempting  to  provide  an  unambiguous  domain  definition  and  explicit  rules  for 
generating  criterion-referenced  test  items.  He  used  the  criteria  of  clarity,  simplicity, 
availability,  development  time  and  costs,  adaptability,  and  domain  appropriateness. 
He noted that the precision of the specifications was inversely related to practicality.
His  evaluation  focussed  on  the  domain  specification-item  writing  linkage  and  did  not 
address  the  overall  definition  of  the  domain. 

Other  potentially  useful  techniques  have  been  suggested  for  investigating  test 
specification  strategies.  Dickinson  (1984)  suggested  sensitivity  analysis  (Fischoff, 
1980)  to  assess  the  effect  of  test  specifications  changes  on  test  responses.  Jarjoura 
and  Brennan  (1983)  demonstrated  the  use  of  multivariate  generalizability  theory 
(Cronbach,  Gleser,  Nanda,  &  Rajaratnam,  1972),  analyzing  data  resulting  from 
multiple  forms  of  a  test.  Kane  (1982)  discussed  a  multifacet  sampling  model,  based  on 
generalizability theory, which highlights the weaknesses of some routinely made
inferences. He expressed the hope that such a model would encourage research
aimed at defining universes more precisely. Covariance structure analysis (Joreskog,
1978; Linn & Werts, 1979) has been suggested to analyze the reliability of different test
forms. 


Gottfredson  (1986)  suggested  procedures  for  determining  the  equivalence  of 
alternative criterion measures. Five general aspects of equivalence were discussed:
validity, reliability, susceptibility to compromise (i.e., changes in validity or reliability
with extensive use), financial cost, and acceptability to interested parties. These
procedures  also  could  be  applied  to  evaluate  the  equivalence  of  tests  developed 
under  different  content  selection  strategies. 

In discussing the role of content-oriented procedures in developing job performance
measures, Gottfredson emphasized that "it is by no means an atheoretical task to
define  the  content  domain  of  a  job  or  to  sample  from  it"  (p.30).  She  also  noted  that 
while  the  care  taken  in  enumerating  and  sampling  tasks  in  a  content  domain  creates 
the  aura  of  construct  validity  and  relevance  it  says  nothing  about  the  relevance  of  the 
domain  as  it  was  defined.  She  emphasized  that  the  construct  validity  and  relevance  of 
a  measure  is  not  established  by  detailing  the  techniques  used  to  construct  it  but  by 
research  on  the  resulting  test  scores  and  the  adequacy  of  the  theories  underlying  the 
development  and  interpretation  of  the  measure.  She  suggested  that  the  great  strength 
of  content-oriented  test  construction  for  validation  purposes  is  the  rich  source  of  a 
priori  hypotheses  that  can  be  tested  empirically. 

Domain Sampling and Test Length

Test  length  is  usually  constrained  by  limits  of  testing  time  and  time  available  to 
construct  test  items.  Therefore,  it  is  necessary  to  rely  on  a  sample  of  items  from  the  test 
content  universe  to  estimate  an  individual's  true  content  domain  score.  How  well  the 
individual's  score  on  the  sample  of  test  items  reflects  that  individual's  true  content 
domain  score  can  be  affected  by  examinee  guessing,  test  administration  problems, 
ambiguous  items,  and  nonrepresentative  sampling  of  test  items  (Hambleton,  1984a). 
However,  the  impact  of  ambiguous  or  nonrepresentative  items  should  be  less  in  a 
large sample of items than in a small item sample. Unfortunately, many test
developers  do  not  have  a  good  grasp  of  how  long  the  test  should  be  and  tend  to 
develop  tests  to  fit  the  time  constraints. 

The  relationship  between  observed  scores  on  the  test  content  sample  and  the 
true  score  on  the  content  domain  is  reflected  in  the  test's  reliability  and  validity.  The 
impact  of  test  length  on  test  characteristics  has  been  widely  investigated  in  the  context 
of  both  classical  test  theory  and  item  response  theory.  Lord  and  Novick  (1968) 
summarized  the  relationships  among  domain  scores,  test  length,  and  reliability.  These 
relationships  are  applicable  to  criterion-referenced  tests  intended  to  estimate  domain 
scores (Hambleton, 1984a). Other authors have investigated the relationship of test
length and classification errors when test scores are used to assign examinees to
mastery states. Hambleton (1984a) reviewed methods to determine the test length
needed to reduce classification errors, including methods making use of
the binomial model (Millman, 1972, 1973), Bayesian methods (Novick & Lewis, 1974),
an "indifference zone" (Wilcox, 1976), computer simulation methods (Eignor &
Hambleton, 1979; Hambleton, Mills, & Simon, 1983), and item response theory
(Birnbaum, 1968; Lord, 1980).
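
The link between test length and classification error can be sketched under a simple binomial model. The code below is an illustration in the spirit of the binomial and indifference-zone approaches, not a reproduction of any cited author's procedure: it assumes an examinee with domain score `pi` answers each item correctly independently with probability `pi`, and it searches for the shortest test that keeps both error rates below a chosen bound at the edges of an assumed indifference zone. All parameter values are hypothetical.

```python
from math import ceil, comb

def prob_pass(pi, n, cut_count):
    """P(number correct >= cut_count) for an examinee with domain score pi."""
    return sum(comb(n, k) * pi**k * (1 - pi) ** (n - k) for k in range(cut_count, n + 1))

def error_rates(n, cut_prop, pi_low, pi_high):
    """False-positive rate at pi_low and false-negative rate at pi_high."""
    cut_count = ceil(round(cut_prop * n, 9))  # round first to avoid float artifacts
    false_pos = prob_pass(pi_low, n, cut_count)        # nonmaster passes
    false_neg = 1 - prob_pass(pi_high, n, cut_count)   # master fails
    return false_pos, false_neg

def shortest_test(cut_prop, pi_low, pi_high, max_error, n_max=200):
    """Smallest n (up to n_max) with both error rates <= max_error, else None."""
    for n in range(1, n_max + 1):
        fp, fn = error_rates(n, cut_prop, pi_low, pi_high)
        if fp <= max_error and fn <= max_error:
            return n
    return None

# Hypothetical setting: 80% cut score, indifference zone from 0.60 to 0.95.
n_needed = shortest_test(cut_prop=0.80, pi_low=0.60, pi_high=0.95, max_error=0.10)
```

Narrowing the indifference zone or tightening the error bound lengthens the required test, which is the trade-off these methods are meant to quantify.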

The relationship between test length and estimates of the content domain score
is highly relevant to investigations of test content selection. Classification accuracy is
critical for those criterion-referenced tests used in making classification decisions. The
quality of the domain estimate is critical to accurate classification decisions.

The effect of test length is most apparent on the test's reliability. Lord and
Novick (1968) stated that "if the length of a test is increased $n$ times by adding parallel
measurements, the composite true-score variance increases by the factor $n^2$";
however, "the variance of the composite error score increases only by a factor of $n$" (p.
86). Thus, the true score (domain score) variance increases more rapidly than the
error  score  variance.  It  is  this  relationship  that  makes  increasing  the  test  length 
beneficial.  This  relationship  of  true  score  variance  to  error  score  variance  is  reflected 
in  the  test's  reliability  coefficient  that,  in  turn,  sets  an  upper  limit  to  the  square  of  the 
test's  validity  coefficients.  Lord  and  Novick  (1968)  also  discussed  the  effect  of  test 
length  on  validity,  noting  that  "validity  increases  more  slowly  with  length  than  does 
reliability"  and  "validity  increases  more  quickly  with  length  when  the  initial  reliability  is 
low, and decreases less quickly with length when the initial reliability is high" (p. 115).
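
As a worked step, the Spearman-Brown relationship follows from true-score variance growing by a factor of $n^2$ while error variance grows only by a factor of $n$ when parallel measurements are added. The symbols used here ($\sigma_T^2$ and $\sigma_E^2$ for the true-score and error variances of the original test, $\rho$ for its reliability, and $\rho_n$ for the reliability of the test lengthened $n$ times) are introduced for illustration and are not notation from the source:

$$
\rho_n = \frac{n^2\sigma_T^2}{n^2\sigma_T^2 + n\sigma_E^2}
       = \frac{n\sigma_T^2}{n\sigma_T^2 + \sigma_E^2}
       = \frac{n\rho}{1 + (n-1)\rho},
\qquad
\rho = \frac{\sigma_T^2}{\sigma_T^2 + \sigma_E^2}.
$$

For instance, a test with $\rho = 0.60$ that is doubled in length with parallel items would have $\rho_2 = 1.20/1.60 = 0.75$.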

Crocker  and  Algina  (1986)  noted,  in  their  discussion  of  test  length  and 
reliability,  that  projections  of  the  reliabilities  of  tests  of  various  lengths  using  the 
Spearman-Brown  prophecy  formula  are  accurate  "only  if  items  added  or  removed  are 
parallel in content and difficulty to items on the original test" (p.146). They also note
that increases in test length follow the law of diminishing returns, in that doubling the
length of a test will result in a larger increase in reliability than will occur if the same
number of items is again added to the test. This means that, at some point, the small
increases in reliability will not justify the time and effort required to add additional
items. In discussing reliability coefficients for criterion-referenced tests, they note that
increasing test length increases the generalizability of the test scores and the decision
consistency. However, the magnitude of the impact of test length on decision
consistency is dependent on the specific situation.
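
The diminishing returns from added length are easy to see numerically. The sketch below applies the Spearman-Brown prophecy formula to a hypothetical test with reliability 0.60; the starting value is an assumption chosen for illustration, and the projection holds only if the added items are parallel to the originals.

```python
def spearman_brown(reliability, length_factor):
    """Projected reliability when test length is multiplied by length_factor."""
    return length_factor * reliability / (1 + (length_factor - 1) * reliability)

base = 0.60                          # hypothetical reliability of the original test
doubled = spearman_brown(base, 2)    # original items plus one equal-sized block
tripled = spearman_brown(base, 3)    # the same number of items added again
gain_first = doubled - base          # gain from the first added block
gain_second = tripled - doubled      # gain from the second added block
```

The second block of items buys less than half the reliability gain of the first, even though it costs the same testing time and item-writing effort.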

The problem for the criterion-referenced test developer is that these characteristics
of tests assume a very homogeneous content domain and the addition of parallel
items,  which  is  rarely  the  case  when  developing  criterion-referenced  tests  other  than 
those  measuring  simple  functions  such  as  mathematical  skills.  Additionally,  the 
methods  traditionally  used  to  assess  test  reliability  were  developed  for  norm- 
referenced  tests  and  have  limited  application  to  criterion-referenced  tests. 

Assessment  of  Test  Quality 


Test  Reliability 

The  increased  use  of  criterion-referenced  testing  (vice  norm-referenced  testing) 
has  led  to  the  development  of  new  techniques.  These  techniques  address  the 
different  goals  of  a  criterion-referenced  test  (i.e.,  assessing  the  degree  of  mastery  of  a 
domain  or  assigning  a  mastery  classification).  However,  consistency  of  measurement 
is a common goal for both criterion-referenced tests and norm-referenced tests.

There  are  several  key  reasons  why  traditional  techniques  for  estimating  the 
reliability of norm-referenced tests are not appropriate for criterion-referenced tests
(Popham, 1978). First, norm-referenced test reliability assessment techniques rely on
correlational  procedures.  These  techniques  require  an  adequate  amount  of  variance 
in  the  test  responses  for  meaningful  results  to  be  obtained.  Unlike  norm-referenced 
tests,  criterion-referenced  tests  are  not  deliberately  designed  to  yield  variability  in  test 
responses. 

Additionally,  even  with  sufficient  variance,  correlations  of  test  scores  only  reflect 
the  relative  degree  of  association.  For  example,  a  test-retest  analysis  could  reflect 
high  agreement  in  the  relative  standing  of  individuals  taking  the  test  on  two  occasions 
while  the  test  scores  could  reflect  markedly  different  levels  of  domain  mastery  across 
the  two  testing  situations. 

Finally,  internal  consistency  estimates,  while  often  applied  to  criterion- 
referenced tests, reflect only the homogeneity of the items. Internal consistency is
most  relevant  to  the  investigation  of  test  item  characteristics  when  the  goal  of  the  test  is 
to  assess  competency  on  a  homogeneous  domain.  Wick  (1973)  suggested  that  the 
notion  of  reliability  is  difficult  to  interpret  and  apply  to  criterion-referenced  tests. 

Several  reviews  and  assessments  have  been  made  of  the  techniques 
developed  to  estimate  criterion-referenced  test  reliability.  These  have  been  presented 
in the literature by Hambleton, Swaminathan, Algina, and Coulson (1978), Linn and
Werts  (1979),  Millman  (1979),  Berk  (1980b),  Brennan  (1980),  Shepard  (1980), 
Subkoviak  (1980),  Traub  and  Rowley  (1980),  Berk  (1984c),  and  Crocker  and  Algina 
(1986). 

In  their  treatment  of  reliability  assessment  for  criterion-referenced  tests,  Crocker 
and  Algina  (1986)  distinguished  between  the  two  purposes  of  such  tests  and 
discussed  the  assessment  of  their  reliability  in  the  context  of  the  purpose  of  the  test. 
One  purpose  discussed  was  the  estimation  of  domain  scores.  The  other  purpose  was 
the  assignment  of  mastery/nonmastery  status,  or,  mastery  allocation.  In  addition  to 
these  two  categories,  Berk  (1984)  discussed  the  category  of  reliability  estimates 
relevant  to  the  reliability  of  criterion-referenced  scores.  These  techniques  were 
covered  by  Crocker  and  Algina  under  the  category  of  assignment  of 
mastery/nonmastery  status.  Such  techniques  reference  individual  scores  to  a  cut-off 
score  as  with  the  mastery/nonmastery  classification  methods.  Berk's  third  category 
can  be  seen  as  an  extension  to  the  assessment  of  mastery-nonmastery  decisions  that 
takes  into  account  a  "sensitivity  to  degrees  of  mastery  and  nonmastery  along  the  score 
continuum  in  addition  to  the  qualitative  master-nonmaster  classification"  (p.246). 

Reliability  assessment  of  tests  used  to  make  mastery  classifications  focuses  on 
decision  consistency  and  the  accuracy  of  the  mastery  allocations  made.  Decision 
consistency  concerns  the  extent  to  which  the  same  decisions  are  made  based  on  two 
different  forms  of  the  test  or  across  two  administrations  of  the  same  test  (Crocker  & 
Algina,  1986).  Techniques  have  been  developed  to  estimate  decision  consistency 
based  on  a  single  administration  of  the  test.  Assessment  of  a  test's  decision  accuracy 
requires  the  estimation  of  the  probabilities  of  false-positive  (assigned  mastery  when 
actually a nonmaster) and false-negative (assigned nonmastery when actually a
master)  outcomes.  These  agreement  indexes  are  based  on  the  assumption  of 
classically  parallel  test  forms  (Berk,  1984). 
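
Decision consistency across two forms can be illustrated with the raw proportion of consistent mastery classifications and its chance-corrected counterpart, kappa. The scores and cut score below are invented for illustration, and the two-administration design is assumed; single-administration estimates require additional model assumptions not shown here.

```python
def decision_consistency(scores_a, scores_b, cut):
    """Return (p0, kappa) for mastery decisions made from two sets of paired scores."""
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("need paired scores from the same examinees")
    n = len(scores_a)
    master_a = [s >= cut for s in scores_a]
    master_b = [s >= cut for s in scores_b]
    # Observed proportion of examinees classified the same way on both forms.
    p0 = sum(a == b for a, b in zip(master_a, master_b)) / n
    # Proportion of consistent decisions expected by chance from the marginals.
    pa, pb = sum(master_a) / n, sum(master_b) / n
    pc = pa * pb + (1 - pa) * (1 - pb)
    kappa = (p0 - pc) / (1 - pc) if pc < 1 else float("nan")
    return p0, kappa

# Hypothetical scores of ten examinees on two 10-item forms, cut score of 7.
form_a = [9, 8, 7, 5, 4, 9, 6, 3, 8, 7]
form_b = [8, 9, 6, 6, 3, 9, 7, 4, 7, 8]
p0, kappa = decision_consistency(form_a, form_b, cut=7)
```

Because p0 depends heavily on how many examinees fall far from the cut score, kappa is often reported alongside it.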

More  relevant  to  the  issue  of  test  content  theory  is  the  estimation  of  the  reliability 
of  domain  score  estimates.  Berk  (1984)  characterized  domain  score  reliability 
estimates  as  being  "concerned  generally  with  estimating  the  stability  of  an  individual's 
score  or  proportion  correct  in  the  item  domain,  independent  of  any  mastery  standard" 
(p.252). Crocker and Algina (1986) discuss reliability theory for domain score
estimates  based  on  generalizability  theory  (Cronbach,  Gleser,  Nanda,  &  Rajaratnam, 
1972).  Generalizability  theory  provides  a  basis  for  investigating  the  extent  to  which  a 
sample  of  measurements  generalizes  to  the  measurement  universe.  Analysis  of 
variance  is  used  to  decompose  score  variance  into  that  attributable  to  various  testing 
conditions  and  that  attributable  to  true  score  variance.  Generalizability  theory  allows 
the  estimation  of  variance  associated  with  test  forms  versus  that  associated  with  the 
examinees.  Randomly  parallel  test  forms  are  assumed  in  generalizability  theory 
(Osburn, 1968).
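
The variance decomposition can be sketched for the simplest crossed design, persons by items, with one score per cell. This design, the tiny data matrix, and the function names are assumptions chosen for illustration; a study comparing test forms would add a forms facet. The sketch estimates variance components from the ANOVA mean squares and then forms a generalizability coefficient for relative decisions and a dependability index for absolute (domain score) decisions.

```python
def g_study(scores):
    """Estimate variance components for a persons x items design.

    scores: list of rows (one per person), each a list of item scores.
    Returns (var_persons, var_items, var_residual).
    """
    n_p, n_i = len(scores), len(scores[0])
    grand = sum(map(sum, scores)) / (n_p * n_i)
    p_means = [sum(row) / n_i for row in scores]
    i_means = [sum(row[j] for row in scores) / n_p for j in range(n_i)]
    ss_p = n_i * sum((m - grand) ** 2 for m in p_means)
    ss_i = n_p * sum((m - grand) ** 2 for m in i_means)
    ss_tot = sum((x - grand) ** 2 for row in scores for x in row)
    ms_p = ss_p / (n_p - 1)
    ms_i = ss_i / (n_i - 1)
    ms_res = (ss_tot - ss_p - ss_i) / ((n_p - 1) * (n_i - 1))
    # Negative estimates are truncated at zero, a common convention.
    var_p = max(0.0, (ms_p - ms_res) / n_i)
    var_i = max(0.0, (ms_i - ms_res) / n_p)
    return var_p, var_i, ms_res

def coefficients(var_p, var_i, var_res, n_items):
    """Generalizability (relative) and dependability (absolute) for an n_items test."""
    relative = var_p / (var_p + var_res / n_items)
    absolute = var_p / (var_p + (var_i + var_res) / n_items)
    return relative, absolute

# Hypothetical right/wrong scores for five examinees on four items.
data = [[1, 1, 1, 0], [1, 1, 0, 0], [1, 0, 0, 0], [1, 1, 1, 1], [0, 0, 0, 0]]
var_p, var_i, var_res = g_study(data)
rel, phi = coefficients(var_p, var_i, var_res, n_items=4)
```

The absolute index is never larger than the relative coefficient because item difficulty variance counts as error when the goal is to estimate a domain score rather than to rank examinees.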

Berk  (1984)  also  reviewed  individual-specific  statistics  that  are  defined, 
computed,  and  interpreted  separately  for  each  individual  and  are  used  to  set  a 
confidence interval around that individual's score. He also discussed a method, not
based  on  generalizability  theory,  to  compute  an  estimate  of  the  standard  error  of 
measurement. 
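
One well-known individual-specific statistic is the standard error of measurement under the binomial error model, which depends only on the examinee's own number-correct score and the test length. It is shown here as an assumed example of this class of statistics; the source does not say which specific statistics Berk reviewed, and the normal-approximation interval is likewise an illustrative choice.

```python
import math

def binomial_sem(n_correct, n_items):
    """Individual standard error of measurement (raw-score units), binomial error model."""
    if n_items < 2 or not 0 <= n_correct <= n_items:
        raise ValueError("need n_items >= 2 and 0 <= n_correct <= n_items")
    return math.sqrt(n_correct * (n_items - n_correct) / (n_items - 1))

def score_interval(n_correct, n_items, z=1.96):
    """Approximate interval around an individual's number-correct score."""
    sem = binomial_sem(n_correct, n_items)
    return max(0.0, n_correct - z * sem), min(float(n_items), n_correct + z * sem)

# Hypothetical examinee: 16 correct on a 20-item test.
sem_16 = binomial_sem(16, 20)
low_16, high_16 = score_interval(16, 20)
```

The error is largest for mid-range scores and shrinks toward zero at the extremes, so a single test-wide standard error can misstate the precision of individual scores.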

Test Validity

Validity has been treated extensively in the literature. Cronbach (1971)
presented an in-depth treatment of test validity, making the point that "One validates,
not a test, but an interpretation of data arising from a specified procedure" (p.447). The
standards for educational and psychological tests and testing (American
Psychological Association, et al., 1974; American Educational Research Association,
et al., 1985) discussed validity as an inference. These documents discussed two
broad classes of validity questions, those dealing with inferences about what the test
measures and questions about the test's usefulness as a predictor of other variables.
The documents also presented a discussion of the three "types" of validity - content,
construct,  and  criterion-related.  Test  validation  procedures  are  typically  classified 
within  these  three  categories. 

These  three  types  of  validation  procedures  are  interrelated  and  overlapping, 
with  each  addressing  a  specific  aspect  of  the  test  and  the  interpretation  of  scores  on 
the  test.  Broadly  defined,  content  validity  refers  to  the  extent  to  which  the  content  of  the 
test represents the behavioral domain to be measured. Criterion-related validity
reflects the effectiveness of a test in predicting a person's behavior in a specified
situation, either concurrently with the test or in the future. Construct validity is
concerned with the extent to which a test measures a theoretical construct or trait. It is
evaluated by investigating the degree to which certain explanatory concepts account
for performance on the test (American Psychological Association, et al., 1974; American
Educational Research Association, et al., 1985; Cronbach, 1971). A detailed
exposition  of  construct  validity  has  been  outlined  by  Cronbach  and  Meehl  (1955). 


Several  authors  have  called  for  a  more  unified  view  of  test  validity.  In  his 
discussion of validity and ethics of assessment, Messick (1980) discussed validity as
inference  from  evidence  suggesting  that  "different  kinds  of  inferences  from  test  scores 
require  different  kinds  of  evidence,  not  different  kinds  of  validity."  He  suggested  that 
validity  is  the  "general  imperative  in  measurement"  as  the  overall  "degree  of 
justification  for  test  interpretation  and  use"  (p.1014).  Dunnette  and  Borman  (1979) 
have  suggested  that  the  categorization  of  validity  into  "types"  leads  to  a  simplistic  view 
of  the  validity  issue.  Landy  (1986)  made  a  case  for  validation  as  hypothesis  testing. 
However,  his  emphasis  on  investigating  the  predictive  power  of  the  test  was  criticized 
by  Messick  (1989),  who  asserted  that  the  validation  process  should  begin  with  a  focus 
on  explanation,  not  prediction.  The  1985  Standards  for  Educational  and 
Psychological Testing (American Educational Research Association, et al., 1985)
presented  validity  as  a  unitary  concept,  referring  to  categories  of  validity  (i.e.,  content- 
related,  criterion-related,  and  construct-related)  rather  than  types  of  validity. 


Content-Related  Validity.  Content-related  validity  has  been  given  treatment  in 
general  texts  on  testing,  such  as  Anastasi  (1982)  and  Cronbach  (1971),  and  in  articles 
dealing  with  issues  specific  to  the  topic.  Ebel  (1956)  discussed  content  validity  as  it 
pertains  to  educational  achievement  tests,  making  the  point  that  "all  types  of  validity 
are based ultimately on the content validity of some measurement procedures"
(p.281). He suggested that the best evidence of content validity is obtained through
the "detailed, systematic, critical inspection of the test itself" (p.261).


The  issue  of  content  validity  has  gotten  much  attention  from  those  involved  in 
personnel testing due to the legal requirement to show job relatedness of job selection
procedures,  as  provided  in  the  Uniform  Guidelines  on  Employee  Selection 
Procedures  (1978).  Lawshe  (1975)  discussed  content  validity  of  personnel  tests, 
suggesting  that  there  was,  at  the  time  of  the  article,  a  "paucity  of  literature  on  content 
validity  in  employment  testing”  (p.563).  He  presented  a  conceptual  framework  in 
which  to  fit  content  validity  into  the  personnel  field,  discussing  the  concept  of  job 
content  validity.  He  also  presented  an  approach  to  the  quantification  of  content 
validity.  Gavin  (1977)  and  Prien  and  Ronan  (1971)  also  discussed  content  validity  as 
applied  to  personnel  testing. 
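
Lawshe's quantification is commonly expressed as a content validity ratio (CVR) computed for each item from a panel's judgments: the number of panelists rating the item "essential," compared with half the panel size. The sketch below computes that ratio; the panel counts are invented for illustration, and the interpretation of any particular CVR value is left to Lawshe's original article.

```python
def content_validity_ratio(n_essential, n_panelists):
    """Lawshe's CVR: (n_e - N/2) / (N/2), ranging from -1 to +1."""
    if n_panelists <= 0 or not 0 <= n_essential <= n_panelists:
        raise ValueError("need 0 <= n_essential <= n_panelists and n_panelists > 0")
    half = n_panelists / 2
    return (n_essential - half) / half

# Hypothetical panel of 10 judges rating three items as essential or not.
essential_counts = {"item_1": 10, "item_2": 8, "item_3": 5}
cvr = {item: content_validity_ratio(n, 10) for item, n in essential_counts.items()}
```

A CVR of zero means exactly half the panel judged the item essential; positive values mean more than half did.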


Lennon  (1956)  outlined  three  assumptions  underlying  the  use  of  content 
validity:  1)  that  the  area  of  concern  to  the  tester  can  be  conceived  as  a  meaningful, 
definable  universe  of  responses,  2)  that  a  sample  can  be  drawn  from  this  universe  in 
some purposive, meaningful fashion, and 3) that the sample and the sampling process
can be defined with sufficient precision to enable the test user to judge how adequately
performance on the sample typifies performance on the universe. Anderson
(1972) stated that the "primitive, first requirement for a system of measurement" is that
there is a clear and consistent definition of the things to be measured (p.145).

Several authors have discussed what actually constitutes content validity if,
indeed,  it  can  properly  be  considered  as  a  type  or  category  of  validity.  Messick  (1975) 
suggested  the  use  of  the  term  "content  relevance"  or  "content  representativeness" 
instead of content validity. Linn (1980a), in a discussion of validity for criterion-
referenced measures, viewed content validity as restricted to the items of the test,
excluding  examinee  responses  to  the  items  from  the  definition.  Additionally,  he  stated 
that  "few  would  consider  content  validity  to  stand  on  an  equal  footing  with  the  other  two 
types  of  validity  in  terms  of  the  rigor  of  the  evidence  that  is  usually  provided  to  support 
a  claim  of  validity"  (p.548).  Hambleton  (1980)  also  discussed  content  validity  as 
pertaining  only  to  the  content  of  the  test  and  not  to  examinee  responses.  Benson 
(1981)  suggested  that  item  writing,  item  format,  test  instructions,  and  item  readability 
be  considered  in  the  content  validity  of  achievement  scores. 

Guion  (1977)  included  both  the  stimulus  and  response  components  of  the  test 
in  the  consideration  of  content  validity  and  suggested  a  set  of  five  minimal  conditions 
for the acceptance of a measure on the basis of content validity: 1) the content domain
must involve "behavior with a generally accepted meaning" (p.6), 2) the definition of
the domain must be unambiguous, 3) the domain must be relevant to the purposes of
the measurement, 4) "Qualified judges must agree that the domain has been
adequately sampled" (p.7), and 5) the measure must have reliability. In a discussion
of content fairness, Guion (1978) pointed out that the scoring system influences the
validity of the inferences made and, thus, a representative sample of the content
domain does not assure validity. He agreed with Messick (1975) and Tenopyr (1977)
that  content  validity  refers  only  to  content-oriented  test  development.  Guion  (1979) 
asserted  that  content  validity  is  a  special  case  of  construct  validity. 

Fitzpatrick (1983) reviewed and evaluated the ways in which test specialists
have defined content validity. She outlined six areas with which content validity has
been associated: 1) the sampling adequacy of test content, 2) the sampling adequacy
of test responses, 3) the relevance of test content to a content universe, 4) the
relevance of test responses to a behavioral universe, 5) the clarity of content domain
definitions, and 6) the technical quality of test items. She suggested that these are
definitions of concepts other than content validity, and as no appropriate means of
defining content validity can be determined, content validity is "not a useful term for test
specialists to retain in their vocabulary" (p.11). However, she suggested no alternative
terminology  to  replace  it. 

Crocker  and  Algina  (1986)  made  the  point  that  "content  validation  is  a  series  of 
activities that take place after an initial form of the instrument has been developed"
(p.218). The most common procedure for establishing content validity is a matching,
by expert judges, of test items to the test objectives that make up the test's descriptive
scheme. 

Several  authors  have  recommended  ways  to  approach  this  evaluation.  Katz 
(1958) suggested that test objectives be weighted or ranked on importance prior to the
matching. Klein and Kosecoff (1975) suggested using a 5-point scale to rate the
importance of the test objectives. Authors have also suggested ways to structure the
gathering  of  the  matching  information,  such  as  having  the  judges  read  and  respond  to 
each  test  item  before  making  a  rating  (Katz,  1958;  Ebel,  1956;  Klein  &  Kosecoff,  1975; 
Hambleton,  1980;  and  Rovinelli  &  Hambleton,  1977). 

While  useful  to  give  an  indication  of  the  item-test  description  match,  such 
procedures  do  not  assure  a  quality  match  between  the  test  description  and  the 
intended  domain.  Cronbach's  (1971)  duplicate  construction  experiment  addressed 
this  issue  by  comparing  the  scores  on  two  tests  independently  developed  from  the 
same  descriptive  scheme.  This  analysis  gives  an  assessment  of  the  clarity,  and  thus 
the reliability, of the test specifications. Correlations between items matched to the
same  objectives  can  also  be  considered,  since  items  measuring  the  same  objective 
should  display  at  least  moderate  correlations  (Crocker  &  Algina,  1986).  Cronbach 
(1971)  asserted  that  homogeneous  content  throughout  the  test  is  not  evidence  of 
content  validity  but  may,  instead,  represent  oversampling  of  one  area  of  the  domain. 
However,  to  assess  the  degree  to  which  the  test  measures  the  intended  domain  one 
must  look  beyond  content-related  validity  to  construct-  and  criterion-related  validity. 
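
The within-objective correlation check attributed above to Crocker and Algina (1986) can be sketched as follows. This is only a minimal illustration, assuming dichotomously scored items; the function names, item labels, objective labels, and response data are hypothetical and are not taken from the study.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def mean_within_objective_r(responses, objective_of):
    """Average inter-item correlation among items matched to each objective.

    responses:    dict item_id -> list of 0/1 scores, one per examinee
    objective_of: dict item_id -> objective label assigned by the judges
    """
    by_objective = {}
    for item, objective in objective_of.items():
        by_objective.setdefault(objective, []).append(item)
    result = {}
    for objective, items in by_objective.items():
        pairs = list(combinations(items, 2))
        if pairs:  # an objective with a single item has no pairs to correlate
            total = sum(pearson(responses[a], responses[b]) for a, b in pairs)
            result[objective] = total / len(pairs)
    return result

# Hypothetical responses: 6 examinees, 4 items, 2 objectives
responses = {
    "i1": [1, 1, 0, 0, 1, 0],
    "i2": [1, 1, 0, 0, 0, 0],
    "i3": [0, 1, 1, 0, 1, 1],
    "i4": [0, 1, 1, 0, 1, 0],
}
objective_of = {"i1": "A", "i2": "A", "i3": "B", "i4": "B"}
within_r = mean_within_objective_r(responses, objective_of)
```

An item answered identically by every examinee has zero variance and would need to be screened out before calling `pearson`, since its correlation is undefined.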

Construct-Related  Validity.  In  the  1950's,  the  American  Psychological 
Association  Committee  on  Psychological  Tests  attempted  to  specify  the  qualities  of  a 
test  that  should  be  investigated  before  its  publication  (American  Psychological 
Association,  1954).  The  explication  of  the  concept  of  construct  validity  was  cited  by 
Cronbach  and  Meehl  (1955)  as  the  "chief  innovation  in  the  committee's  report"  (p. 
281). Since then, the use of construct validity has been addressed by many authors
(e.g., American Psychological Association, et al., 1974; Bechtoldt, 1959; Campbell &
Fiske, 1959; Loevinger, 1957; Royce, 1963).

An  outgrowth  of  personality  testing,  construct  validity  is  the  process  of  gathering 
evidence  to  support  a  proposed  interpretation  of  scores  on  a  test.  It  is  most  useful  and 
appropriate  to  investigate  construct  validity  when  the  interest  is  in  what  the  test  actually 
measures, rather than its predictive efficiency, and when there is no clear criterion
measure with which to compare scores on the test. Without a definite criterion, the
investigator must rely on indirect measures (Cronbach & Meehl, 1955). Messick
(1975)  defines  construct  validation  as  the  "process  of  marshaling  evidence  in  the  form 
of  theoretically  relevant  empirical  relations  to  support  the  inference  that  an  observed 
response  consistency  has  particular  meaning"  (p.955).  Anastasi  (1982)  pointed  out 
that  construct  validity  is  an  accumulation  of  information  from  any  source  that  would 
provide  insight  into  the  nature  of  the  construct  being  measured.  She  also  noted  that 
construct validity is "a comprehensive concept, that includes the other types" of validity
(p.153). 

Cronbach  (1971)  gave  a  rationale  for  the  assertion  that  content  categories,  such 
as  those  used  to  develop  achievement  tests,  are  almost  always  constructs,  as  a 
content  category  represents  a  means  of  organizing  experience.  Tenopyr  (1977) 
distinguished  construct  validity  from  content  validity,  asserting  that  content  validity 
concerns inferences about test content whereas construct validity concerns inferences
about test scores. Thus, construct validity is useful in investigating the validity of
inferences made from achievement test scores as well as inferences made from
measures of personality constructs. However, construct validation has not been
common in the assessment of criterion-referenced tests; this is possibly due to the lack
of variability often found in criterion-referenced test scores (Hambleton, 1984b).

There  is  no  single  way  to  conduct  a  construct  validity  study,  nor  is  there  one 
analysis  procedure  singularly  appropriate  to  the  investigation  of  construct  validity. 
Construct  validity  is  not  reported  in  the  form  of  a  single  statistic;  it  is  a  judgment  based 
on  all  of  the  available  evidence.  Construct  validation  begins  by  stating  the  intended 
use  of  the  test  scores  (Hambleton,  1984b).  This  "definite  statement  of  proposed 
interpretation,"  in  conjunction  with  the  exploration  of  possible  counterhypotheses,  will 
suggest  what  evidence  should  be  collected  to  support  the  interpretation  (Cronbach, 
1971,  p.  483). 

Several  techniques  are  commonly  suggested  for  use  in  the  analysis  of  construct 
validity  (Cronbach,  1971;  Hambleton,  1984b;  Crocker  &  Algina,  1986).  Correlational 
analyses  are  used  to  investigate  the  association  of  test  scores  with  variables  logically 
thought  to  be  related.  Exploratory  factor  analysis  or  confirmatory  factor  analysis  is 
used to investigate whether the items fit a hypothesized structure. Guttman scalogram
analysis  has  been  suggested  as  a  promising  method  for  use  when  test  objectives  can 
be  arranged  linearly  or  hierarchically.  The  multitrait-multimethod  approach, 
developed  by  Campbell  and  Fiske  (1959),  can  be  used  to  investigate  how  much  of  a 
measure's  variance  can  be  attributed  to  the  trait  being  measured  and  how  much  can 
be  attributed  to  the  method  being  used  to  measure  the  trait.  To  conduct  this  analysis  it 
is  necessary  to  have  multiple  traits  (e.g.,  job  knowledge,  leadership)  measured  by  the 
same  measure  (e.g.,  a  supervisor  rating  form)  and  also  to  have  multiple  types  of 
measures  (e.g.,  supervisor  rating  form,  self  inventory)  used  to  measure  each  trait. 
Similarly,  Kane  (1982)  suggested  the  use  of  analysis  of  variance  components  via  the 
application  of  generalizability  theory  to  investigate  the  dependability  of  test  scores 
across  different  methods  of  measurement.  The  choice  of  analysis  approach  depends 
on logic and, often, on the availability of information.
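
As an illustration of the multitrait-multimethod layout described above, the sketch below averages correlations within three blocks of the matrix: same trait measured by different methods (convergent), different traits measured by the same method, and different traits measured by different methods. It is a simplified summary in the spirit of Campbell and Fiske (1959), not a full variance partition; the function names and all scores are hypothetical.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def mtmm_summary(scores):
    """Mean correlation per block; scores maps (trait, method) -> score list."""
    blocks = {
        "convergent": [],                # same trait, different methods
        "heterotrait_monomethod": [],    # different traits, same method
        "heterotrait_heteromethod": [],  # different traits, different methods
    }
    for (t1, m1), (t2, m2) in combinations(scores, 2):
        r = pearson(scores[(t1, m1)], scores[(t2, m2)])
        if t1 == t2:
            blocks["convergent"].append(r)
        elif m1 == m2:
            blocks["heterotrait_monomethod"].append(r)
        else:
            blocks["heterotrait_heteromethod"].append(r)
    return {name: sum(rs) / len(rs) for name, rs in blocks.items() if rs}

# Hypothetical ratings for 5 examinees on 2 traits by 2 methods
scores = {
    ("job knowledge", "supervisor rating"): [1, 2, 3, 4, 5],
    ("job knowledge", "self inventory"):    [2, 1, 3, 5, 4],
    ("leadership", "supervisor rating"):    [3, 1, 4, 2, 5],
    ("leadership", "self inventory"):       [3, 2, 5, 1, 4],
}
summary = mtmm_summary(scores)
```

Convergent correlations that clearly exceed both heterotrait averages suggest that the scores reflect the traits more than the methods used to measure them.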

Criterion-Related  Validity.  Criterion-related  validity  is,  perhaps,  the  most 
straightforward  of  the  validity  processes.  Investigation  of  criterion-related  validity  is 
intended to assess the "effectiveness of a test in predicting an individual's behavior in
specified  situations"  (Anastasi,  1982,  p.  137).  A  "criterion"  performance  is  used  to 
assess  the  test's  predictive  power.  For  example,  job  performance  might  be  used  as 
the  criterion  against  which  to  evaluate  an  occupational  aptitude  test,  and  academic 
achievement  is  often  used  to  validate  a  scholastic  aptitude  test.  Determining  a  good, 
reliable,  and  valid  measure  of  the  criterion  performance  is  the  most  problematic 
feature  of  a  criterion-related  validity  study.  For  example,  job  performance  measures 
such  as  supervisor's  ratings  often  reflect  more  than  the  individual's  job  proficiency; 
they  may  reflect  one's  ability  to  get  along  with  one's  boss,  which,  although  important,  is 
not  what  the  aptitude  test  was  intended  to  predict. 
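
A criterion-related validity coefficient is conventionally the correlation between test scores and criterion scores. The sketch below computes that coefficient and, optionally, applies the classical correction for unreliability in the criterion. The function names, scores, and the reliability value of 0.81 are hypothetical, and the correction addresses only random error in the criterion; it does not remove systematic contamination of the kind described above for supervisor ratings.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def validity_coefficient(test_scores, criterion_scores, criterion_reliability=None):
    """Test-criterion correlation, optionally corrected for criterion unreliability."""
    r = pearson(test_scores, criterion_scores)
    if criterion_reliability is not None:
        # Classical correction for attenuation in the criterion only
        r = r / sqrt(criterion_reliability)
    return r

# Hypothetical aptitude scores and supervisor ratings for 5 examinees
aptitude = [1, 2, 3, 4, 5]
supervisor_rating = [2, 1, 3, 5, 4]
observed = validity_coefficient(aptitude, supervisor_rating)
corrected = validity_coefficient(aptitude, supervisor_rating, criterion_reliability=0.81)
```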


Criterion-related  validity  can  be  either  concurrent  or  predictive,  depending  on 
the  time  lapse  between  test  administration  and  collection  of  criterion  data.  Predictive 
validation  is  most  appropriate  in  selection  and  classification  situations,  such  as  those 
found  in  job  hiring  and  placement  or  academic  selection  situations.  Concurrent 
validation  is  appropriate  in  situations  where  the  test  is  intended  to  perform  a  diagnostic 
function,  such  as  assessment  of  an  individual's  current  psychological  state.  However, 
concurrent  validation  is  often  used  as  a  substitute  for  predictive  validation  when  it  is 
inconvenient  or  impossible  to  collect  data  on  the  criterion  performance  at  a  future  point 
in  time.  In  this  instance,  criterion  data  currently  available  are  used,  or  criterion  data 
are  collected  during  the  same  timeframe  in  which  the  test  is  administered  (Anastasi, 
1982). 

CHAPTER III


STATEMENT  OF  THE  PROBLEM 

The  development  of  a  test  involves  theoretical,  technical  and  practical  issues.  A 
test  typically  is  designed  to  measure  a  construct,  whether  the  construct  is  achievement, 
ability,  or  skills  in  a  particular  content  domain.  Measurement  of  a  construct  involves 
the  theoretical  issue  of  how  to  define  the  construct.  Constructs  that  are  the  target  of 
measurement  are  rarely  unidimensional;  therefore,  it  is  important  to  consider  how  best 
to  define  the  dimensions  making  up  a  construct,  such  as  the  skills  that  make  up 
reading  ability. 

Theoretical  issues  mesh  with  technical  and  practical  issues  when  one  develops 
a  test  to  measure  a  construct.  Construction  of  reliable,  valid  tests  involves  technical 
issues  (e.g.,  the  proper  development  of  quality  test  items)  as  well  as  practical 
considerations  of  test  administration  (e.g.,  constraints  in  testing  time).  Thus,  test 
development  is  not  merely  an  exercise  in  the  construction  of  test  items  that  appear  to 
be  related  to  the  target  of  measurement  but,  rather,  a  complex  task  that  involves 
theoretical,  technical,  and  practical  considerations. 

One  practical  consideration  that  directly  affects  the  operationalization  of  the 
construct  is  test  length.  If  one  could  measure  completely  the  construct  in  question,  the 
content  of  the  test  could  be  taken  at  face  value  as  covering  the  content  domain.  The 
form  of  the  operationalization  (i.e.,  the  test  format,  item  types,  scoring  procedures,  etc.) 
would  still  be  subject  to  evaluation  in  terms  of  its  appropriateness  to  the  given  content 
domain;  however,  content  domain  representativeness  would  not  be  in  question  as  the 
test content universe would match completely the content domain, and no item
sampling  would  be  involved. 

Obviously,  it  is  rarely  possible  to  cover  the  content  domain  completely  except  in 
instances  where  the  content  domain  is  very  narrowly  defined  (e.g.,  the  addition  of 
single-digit numbers) or where the test content universe and the test content sample
are  fundamentally  the  same  as  the  content  domain  (e.g.,  a  probationary  period  on  a 
job).  Therefore,  one  must  often  rely  on  samples  of  the  content  domain  (and  associated 
test  content  universe)  to  represent  the  total  content  domain.  The  strategy  used  to 
sample  the  content  domain  directly  influences  the  representativeness  of  the  resulting 
test  content  sample. 

Thus,  one  of  the  most  critical  steps  in  test  construction  is  the  definition  and 
sampling of the content domain. Because this definition of the content domain
represents a construct, there is a theory of the construct implied in the definition and in
the  strategy  used  to  sample  from  the  domain.  This  is  one  of  the  least  researched  areas 
of  test  construction  and  application. 

The  initial  content  theory  underlying  the  development  of  test  specifications  is 
basic to any inferences that are drawn from the test scores. The amount of time, effort,
and research that go into developing the theory underlying the test specifications
varies  widely  from  situation  to  situation;  little  is  known  about  the  impact  of  this  initial 
step  on  the  scores  obtained  with  the  measure.  It  is  unknown  whether  differences  in 
content  theory,  as  reflected  in  the  test  specifications,  have  any  real,  meaningful  impact 
on  the  obtained  test  scores,  or  whether  carefully  constructed  tests  are  robust  enough 
to  provide  meaningful  results  regardless  of  variations  in  the  content  theory  used. 

From  these  issues  the  question  arises:  "What  effects,  if  any,  do  differences  in 
test  content  theory,  as  operationalized  by  the  domain  definition  and  sampling,  have  on 
the  resulting  test  scores  and  the  inferences  that  can  be  made  from  those  scores?"  This 
is  a  question  that  has  meaning  for  test  developers.  The  content  theory  is  typically 
reflected  in  the  weighted  test  outline.  If  no  real  differences  are  found  in  the  validities  of 
tests  based  on  different  content  theories,  then  the  test  constructor  can  feel  secure  that 
with  careful  construction  the  test  will  provide  results  relevant  to  the  content  domain, 
regardless  of  the  specific  test  content  theory  used.  However,  if  there  are  real 
differences  in  the  validities  of  tests  developed  using  different  test  content  theories,  it 
would  point  to  a  need  for  further  research  into  content  theory  development  and  the 
need  for  practical  guidelines  for  test  developers  as  to  how  to  develop  and  test  a 
content theory prior to the construction of test items.

Therefore,  this  research  investigated  the  effects  of  different  test  content  theories, 
as  reflected  by  different  test  specifications  (the  content  domain  definitions  and 
sampling  strategies),  on  the  test  scores  and  on  the  inferences  that  can  be  made  based 
on them. In general, it was theorized that tests based on a more detailed theory of the
content  area,  as  evidenced  by  the  test  specifications  (strong  theory-based  tests), 
would  yield  more  relevant  and  desirable  results  in  terms  of  reliability,  content  validity 
and  construct  validity  than  would  tests  based  on  a  less  detailed  content  theory  (weak 
theory-based  tests).  It  was  also  hypothesized  that  those  differences  would  become 
more  apparent  as  the  item  sample  size  (test  length)  became  smaller. 

Hypotheses

The  specific  hypotheses,  with  regard  to  criterion-referenced  tests,  were: 

1. Strong test content theory provides the structure and guidance to the domain
sampling process required to produce multiple forms of a test that are comparable in
terms of internal psychometric properties (means and standard deviations), i.e., two
forms of a test constructed from strong test content theory-based specifications are
more likely to have equal means and variances than are two forms of a test
constructed from weak test content theory-based specifications.

2.  Alternate  forms  of  the  same  test  developed  using  strong  test  content  theory- 
based  test  construction  will  be  more  nearly  equivalent,  in  the  context  of  the  classical 
true  score  model,  than  test  forms  developed  using  weak  test  content  theory-based  test 
construction  procedures.  Therefore,  the  correlation  between  scores  on  two  forms  of  a 
test (correlation of equivalence) will be higher for strong test content theory-based
tests  than  for  weak  test  content  theory-based  tests. 


3.  Test  scores  are  not  generalizable  (in  the  context  of  generalizability  theory) 
across  test  type  (strong  test  content  theory-based  vs.  weak  test  content  theory-based), 
i.e.,  a  test  developed  from  strong  test  content  theory-based  construction  procedures 
will  not  yield  the  same  true  score  for  an  examinee  as  will  a  weak  test  content  theory- 
based  test  measuring  the  same  content  domain. 

4.  Strong  test  content  theory-based  test  construction  produces  tests  with 
evidence  of  better  content  validity  than  does  weak  test  content  theory-based  test 
construction. 

5.  Strong  test  content  theory-based  test  construction  produces  tests  with 
evidence  of  better  construct  validity  than  does  weak  test  content  theory-based  test 
construction. 

6.  As  test  content  sample  size  decreases,  the  relative  efficacy  of  strong  test 
content  theory-based  test  construction  increases,  i.e.,  differences  in  reliability  and 
validity  become  larger. 
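
The quantities named in Hypotheses 1 and 2 (form means, form variances, and the correlation of equivalence) can be computed along the following lines. This is a descriptive sketch only, assuming each examinee has a total score on both forms; the function name and the scores are hypothetical, and no significance tests are shown.

```python
from math import sqrt
from statistics import mean, variance

def compare_forms(form_a, form_b):
    """Descriptive comparison of two forms taken by the same examinees."""
    ma, mb = mean(form_a), mean(form_b)
    va, vb = variance(form_a), variance(form_b)  # sample variances (n - 1)
    sxy = sum((a - ma) * (b - mb) for a, b in zip(form_a, form_b))
    sxx = sum((a - ma) ** 2 for a in form_a)
    syy = sum((b - mb) ** 2 for b in form_b)
    return {
        "mean_difference": ma - mb,
        "variance_ratio": va / vb,
        "correlation_of_equivalence": sxy / sqrt(sxx * syy),
    }

# Hypothetical total scores for 5 examinees on two forms
form_a = [10, 12, 14, 16, 18]
form_b = [11, 12, 15, 15, 19]
stats = compare_forms(form_a, form_b)
```

Under the hypotheses, forms built from strong theory-based specifications would be expected to show a mean difference nearer zero, a variance ratio nearer one, and a higher correlation of equivalence than forms built from weak theory-based specifications.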


Definitions

An  achievement  test  is  a  test  that  measures  the  extent  to  which  a  person 
commands  a  certain  body  of  information  or  possesses  a  certain  skill,  usually  in  a  field 
where training or instruction has been received.

The  content  domain  is  the  body  of  knowledge,  skills,  and/or  abilities  identified 
as  the  target  of  measurement.  The  content  domain  should  be  clearly  defined  so  that 
items  of  knowledge  or  particular  tasks  can  be  clearly  identified  as  included  in  or 
excluded  from  the  domain. 

A  criterion-referenced  test  is  a  test  that  allows  users  to  estimate  the  proportion  of 
a  specified  content  domain  that  an  individual  has  mastered. 

A  domain  score  is  the  expected  or  true  percentage  of  items  in  the  test  content 
universe  that  an  examinee  can  answer  correctly. 
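
A domain score as defined here is commonly estimated by the proportion of sampled items answered correctly. The sketch below assumes a simple random sample of items and a binomial error model, with an optional finite-population correction when the size of the item universe is known. The function name and the response pattern are hypothetical; the universe size of 376 is used only for illustration.

```python
from math import sqrt

def domain_score_estimate(item_scores, universe_size=None):
    """Proportion correct and its standard error for one examinee.

    item_scores:   list of 0/1 scores on the sampled items
    universe_size: number of items in the universe, if finite and known
    """
    n = len(item_scores)
    p = sum(item_scores) / n
    se = sqrt(p * (1 - p) / n)  # binomial standard error
    if universe_size is not None:
        # Finite-population correction for sampling items without replacement
        se *= sqrt((universe_size - n) / (universe_size - 1))
    return p, se

# Hypothetical examinee: 30 of 40 sampled items answered correctly
scores = [1] * 30 + [0] * 10
p, se = domain_score_estimate(scores)
p_fpc, se_fpc = domain_score_estimate(scores, universe_size=376)
```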

A  norm-referenced  test  is  a  test  for  which  the  score  interpretation  is  based  on 
the  comparison  of  a  test  taker's  performance  to  the  performances  of  other  people  in  a 
specified  group. 

The  test  content  universe  is  the  set  of  all  possible  items  of  acceptable  quality, 
either actual or hypothetical, that could be developed for the content domain.

A test content sample is a sample of items selected from the test content
universe  to  make  up  one  form  of  the  test.  Sample  selection  can  consist  of  direct 
sampling  from  an  actual  test  content  universe  or  indirect  sampling  through  selection  of 
a  sample  of  elements  from  the  content  domain  (knowledges,  skills,  abilities)  and  the 
construction  of  items  to  measure  those  elements. 

Test  content  theory  refers  to  the  rationale,  or  theory,  underlying  the 
development  of  test  specifications  to  guide  test  construction. 

Test  specifications  typically  consist  of  a  content  outline  that  specifies  what 
proportion  of  the  items  shall  deal  with  each  content  area  and  with  each  type  of  skill  or 
ability. 
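
To make the link between test specifications and a test content sample concrete, the sketch below draws one form from an item universe so that each content area receives its specified proportion of items. It assumes direct sampling from an actual item universe and uses largest-remainder rounding; the content area names, weights, item numbers, and function names are all hypothetical.

```python
import random

def allocate(weights, test_length):
    """Number of items per content area, using largest-remainder rounding."""
    raw = {area: w * test_length for area, w in weights.items()}
    counts = {area: int(value) for area, value in raw.items()}
    leftover = test_length - sum(counts.values())
    # Hand any leftover items to the areas with the largest fractional parts
    by_remainder = sorted(raw, key=lambda a: raw[a] - counts[a], reverse=True)
    for area in by_remainder[:leftover]:
        counts[area] += 1
    return counts

def draw_form(universe, weights, test_length, seed=0):
    """Randomly sample items within each content area according to the weights."""
    rng = random.Random(seed)
    counts = allocate(weights, test_length)
    form = []
    for area, k in counts.items():
        form.extend(rng.sample(universe[area], k))
    return sorted(form)

# Hypothetical item universe of 200 items in three content areas
universe = {
    "engines": list(range(1, 101)),
    "electrical": list(range(101, 161)),
    "hydraulics": list(range(161, 201)),
}
weights = {"engines": 0.5, "electrical": 0.3, "hydraulics": 0.2}
counts = allocate(weights, 20)
form = draw_form(universe, weights, 20, seed=1)
```

Changing the seed yields a second form drawn from the same specifications, which is the situation the alternate-forms hypotheses address.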

CHAPTER IV


METHOD 

In  response  to  the  concerns  voiced  by  the  National  Academy  of  Sciences 
(Wigdor  &  Green,  1986)  about  the  methods  used  by  the  Military  Services  to  select  the 
content  of  the  job  performance  measures  used  in  the  Job  Performance  Measurement 
Project,  development  of  a  comprehensive  database  for  investigation  of  questions 
related  to  test  content  selection  was  undertaken.  A  research  team,  consisting  of  the 
author  and  a  research  assistant,  worked  with  subject  matter  experts  to  develop  a  set  of 
test  items  that,  as  nearly  as  possible,  covered  an  entire  selected  content  domain.  The 
item set was then administered to a sample of subjects. The goal was to create a test
item universe - i.e., a set of items covering a defined content domain, from which
samples of items and associated responses could be chosen and used to address test
content  selection  issues. 

The  database  was  also  designed  to  include  data  on  each  examinee  on  factors 
related  to  the  content  domain.  This  required  the  construction  and  administration  of 
rating  forms  and  the  collection  of  background  information.  Aptitude  test  scores  and 
training  school  grades  were  also  included  in  the  database.  Additionally,  the  sample  of 
examinees  was  selected  so  that  data  from  the  Job  Performance  Measurement  Project 
could  be  integrated  into  the  database.  The  present  study  made  use  of  this  database, 
which  is  described  in  more  detail  in  the  following  sections. 

A  comprehensive  job  knowledge  test,  the  job  knowledge  rating  forms  (self  and 
supervisor),  and  the  Experience  and  Training  Rating  Form  were  developed  specifically 
for  use  in  this  study  in  the  winter  and  spring  of  1988.  Data  were  collected  using  these 
instruments  in  the  summer  and  fall  of  1988. 

The  job  performance  test  and  job  proficiency  rating  forms  were  developed  as 
part  of  the  Air  Force  Job  Performance  Measurement  Project.  Data  were  collected 
using  these  instruments  in  the  fall  of  1987  and  have  been  reported  to  Congress.  The 
Job  Performance  Measurement  Project  has  been  documented  in  technical  papers 
(Hedge & Lipscomb, 1987; Lipscomb & Hedge, 1988). Finally, technical school
grades  and  aptitude  test  scores  are  routinely  obtained  on  technical  school  participants 
and  are  available  from  the  technical  training  centers  and  personnel  files  for  research 
purposes. 


Selection  of  a  Content  Domain 

Job  knowledge  of  the  first-term  (1-48  months  of  military  experience)  Aerospace 
Ground  Equipment  (AGE)  General  Mechanic  job  was  chosen  as  the  content  domain 
for the database development for several reasons. The AGE General Mechanic job
was in a career field that had been part of the Job Performance Measurement Project
and, therefore, recent job performance data were available on some of the personnel
in that job. Also, a fairly narrow domain was desirable in order to limit the number of
items necessary to cover the domain; the AGE General Mechanic job is relatively
narrow  in  the  number  and  types  of  tasks  performed.  To  limit  the  domain  further,  it  was 
restricted  to  the  work  typically  performed  by  first-term  airmen  in  the  job.  This  restricted 
the  content  domain  to  more  routine,  proceduralized  tasks  rather  than  the  more 
complex  tasks  and  supervisory  work  performed  by  the  senior  personnel.  Additionally, 
it  was  necessary  that  the  chosen  job  have  a  sufficient  number  of  people  working  in  it  to 
make  data  collection  feasible,  a  criterion  met  by  the  AGE  General  Mechanic  job.  For 
these  reasons,  the  first-term  AGE  General  Mechanic  job  was  chosen  as  the  job  to  be 
used. 


It was decided that the testing vehicle would be a multiple choice, paper-and-pencil
test for ease of administration of a large number of test items. Job knowledge
was  selected  as  a  domain  parameter  because  its  measurement  is  a  common  practice 
and  it  is  appropriate  for  the  use  of  a  multiple  choice  paper-and-pencil  test  format. 

Subjects 

The  subjects  were  U.  S.  Air  Force  enlisted  personnel  who  perform  the  job  of 
Aerospace  Ground  Equipment  (AGE)  General  Mechanic.  The  majority  of  the  subjects 
were first-term airmen with 1-48 months of service; however, the sample also included
some  enlisted  personnel  with  4-20  years  of  experience.  Additionally,  an  effort  was 
made  to  include  in  the  subject  sample  individuals  who  participated  in  the  Job 
Performance  Measurement  Project,  in  which  extensive  job  performance  data  were 
collected  on  these  individuals  (Hedge  &  Teachout,  1986).  Also  participating  in  the 
study  were  subject-matter  experts  (SMEs)  and  the  supervisors  of  the  study  subjects. 
Demographic information for the total sample and for the Job Performance
Measurement Project sample is shown in Tables 1 and 2.


Table 1

Demographic Information for Total Sample

Variable                  Mean     SD      Range          N Valid Cases
Age                       24.42    4.63    18.75-41.75    287
Months in career field    49.58   49.99     1.00-239.00   291
Months in service         54.29   50.72     6.00-240.00   291
Skill level*               4.99    1.24     3.00-7.00     291

Sex: Males = 250 (88.0%); Females = 34 (12.0%); 284 valid cases.

* Skill level reflects level of proficiency. A 3-level is an apprentice, a 5-level is a
journeyman, and a 7-level is a master.

Table 2

Demographic Information for Job Performance Measurement Sample

Variable                  Mean     SD      Range          N Valid Cases
Age                       22.76    1.91    19.75-28.25    79
Months in career field    32.84   10.58    17.00-60.00    80
Months in service         36.49    9.92    18.00-64.00    80
Skill level                5.00    0.00     5.00          80

Sex: Males = 67 (85.9%); Females = 11 (14.1%); 78 valid cases.

Instruments

Comprehensive  Job  Knowledge  Test.  The  test  was  constructed  using  a  listing 
of each of the 119 tasks routinely performed by AGE General Mechanic first-termers.
This  listing  was  taken  from  the  most  recent  Air  Force  Occupational  Survey  Report  for 
the  AGE  career  field,  a  report  of  the  task-level  job  analysis  conducted  on  the  career 
field  (Christal,  1974).  These  119  tasks  account  for  70%  of  the  time  spent  by  people  in 
that  job.  Due  to  time  and  expense  constraints,  tasks  that  were  performed  by  a  small 
percentage  of  people  and  that  accounted  for  very  little  of  the  time  spent  on  the  job 
were not included in the listing. Thus, the goal of producing a test content universe for
the  content  domain  could  not  be  fully  realized.  However,  it  was  still  possible  to  cover 
the  content  domain  in  a  reasonably  comprehensive  way.  The  task  list  was  reviewed 
by  subject  matter  experts  (SMEs)  for  both  accuracy  and  completeness  of  coverage  of 
the  defined  content  domain.  The  tasks  in  the  task  list  are  shown  in  Appendix  A. 

A  detailed  task  analysis  was  conducted  for  each  task  through  which  the  task 
was  broken  down  into  its  component  subtasks;  subject  matter  experts  defined,  for  each 
subtask,  the  job  knowledges  required  to  perform  the  subtask.  Technical  orders  and 
job  guides  were  used  as  reference  material.  The  categories  of  supporting  knowledges 
outlined  in  The  Task  Analysis  Handbook  (e.g.,  primary  factual  knowledge,  knowledges 
prerequisite to skilled performance) (DeVries, Eschenbrenner, & Ruck, 1960) were
used  as  references  to  identify  and  define  the  required  knowledges.  It  should  be  noted 
that  more  than  one  task  required  the  same  knowledges.  Appendix  A  shows  the  tasks 
and  the  associated  knowledges  identified. 

One  job  knowledge  test  item  was  then  written  for  each  of  the  376  job 
knowledges  identified.  Appendix  A  also  shows  the  number  of  the  test  item  developed 
for  each  knowledge.  The  task  list,  job  knowledges,  and  job  knowledge  test  items  were 
extensively  reviewed  and  revised  by  independent  groups  of  SMEs  in  workshops  at 
several Air Force bases. The job knowledge test was constructed and then pretested
and  revised  twice.  Appendix  B  contains  examples  of  job  knowledge  test  items. 


Job  Knowledge  Rating  Forms.  Rating  forms  were  constructed  for  supervisors  to 
use  in  rating  subjects  and  for  each  subject  to  use  in  self  rating  on  the  subject's  degree 
of  job  knowledge.  To  give  an  overview  of  the  job  content  domain,  eight  general  areas 
covering  the  job  were  described  and  a  five-point  scale  was  provided  on  which  to  make 
ratings  of  each  area.  The  eight  areas  were  provided  to  give  the  rater  a  frame  of 
reference and were defined by SMEs using occupational survey report data as
reference. The lowest level of knowledge corresponded to "1" on the scale, and "5"
represented  the  highest  level.  The  rating  forms  were  pretested  and  revised  with  the 
job  knowledge  test.  The  rating  forms  are  shown  in  Appendix  C. 

Experience  and  Training  Rating  Form.  A  form  using  the  eight  job  areas  from  the 
job  knowledge  rating  forms  as  reference  points  was  constructed  to  collect  ratings  from 
subjects on their overall levels of training and/or levels of experience in their job. A
five-point scale was used, with "1" representing a low level of training/experience and
"5" representing a high level. The Experience and Training Form was pretested along
with  the  Comprehensive  Job  Knowledge  Test  and  other  rating  forms.  The  Experience 
and  Training  Form  is  also  shown  in  Appendix  C. 

Job  Performance  Test  and  Rating  Forms.  As  a  part  of  the  Air  Force  Job 
Performance  Measurement  Project  a  job  performance  test  and  job  proficiency  rating 
forms  were  developed  for  the  AGE  career  field  (Hedge  &  Lipscomb,  1987;  Hedge, 
Lipscomb,  &  Teachout,  1988;  Lipscomb  &  Hedge,  1988).  The  Walk-Through 
Performance  Test,  a  job  performance  test,  consisted  of  work  sample  items  that 
required the hands-on performance of certain specified tasks and interview items that
required the examinee to explain, or "talk through," task performance of certain other
tasks. The subject's total score on all the items was his/her Walk-Through
Performance  Test  score. 

Also,  forms  were  developed  to  gather  ratings  of  overall  technical  proficiency 
from  each  examinee's  peers  and  supervisor,  and  to  gather  a  self-rating  from  each 
examinee. A 5-point scale was used, with "5" representing a high level of technical
proficiency and "1" representing a low level. Behavioral descriptors were provided for
each scalar point on the rating form.

Training Performance and Job Aptitude Measures. Tests administered throughout
the 17-week AGE career field technical school were used as measures of training
performance.  The  Mechanical  Aptitude  Composite  of  the  Armed  Services  Vocational 
Aptitude  Battery  (ASVAB)  was  used  as  a  measure  of  aptitude  for  the  job.  Also  used  as 
aptitude  indicators  were  the  10  ASVAB  subtests,  the  Verbal  Composite,  the  General 
Composite,  the  Electronic  Composite,  and  the  Armed  Forces  Qualifying  Test 
Composite. 


Data Collection


The  376-item  Total  Domain  Coverage  Job  Knowledge  Test,  Job  Knowledge 
Self Rating Form, and Experience and Training Rating Form were administered to 294
AGE  personnel  from  AGE  shops  at  10  different  Air  Force  bases  in  the  continental 
United  States.  Job  Knowledge  Supervisor's  Rating  Forms  were  completed  by  their 
supervisors.  Three  trained  test  administrators  separately  visited  the  10  Air  Force 
bases  and  administered  the  tests  to  subjects  in  group  testing  sessions. 

Two  test  booklets  were  constructed.  Each  booklet  contained  all  376  of  the 
items;  the  booklets  differed  only  in  the  ordering  of  the  items.  Item  order  was  random 
within  each  booklet  and  different  across  the  two  booklets.  About  half  the  subjects  used 
one  booklet,  the  other  half  used  the  other  booklet.  Additionally,  the  test  administrators 
instructed  subjects  in  half  of  their  sessions  to  complete  items  189  through  376  first  and 
complete  items  1  through  188  second.  These  measures  were  taken  to  counterbalance 
any  fatigue  effects.  Test  responses  and  subject  background  information  were 
recorded  by  test  participants  on  an  optical  scan  response  sheet. 

Job Knowledge and Experience and Training Rating Form data were collected
from  subjects  after  the  Job  Knowledge  Test  was  administered.  Rating  Form  responses 
were  recorded  by  subjects  and  supervisors  in  a  rating  form  booklet.  Supervisor  Rating 
Forms  were  distributed  by  the  test  administrators  and  were  self-administered. 

During  the  Job  Performance  Measurement  Project  data  collection,  job 
performance  tests  were  administered  to  subjects  individually  by  trained  test 
administrators at the subject's work site (Hedge, Lipscomb, & Teachout, 1988). Test
administration took approximately 3-4 hours. Self, supervisor, and peer ratings of the
subject's  overall  technical  proficiency  were  collected  in  group  sessions  following  a 
rater  training  session.  The  rater  training  session  was  conducted  to  train  raters  to  make 
accurate  and  unbiased  ratings. 

Data  on  subjects'  performances  in  technical  school  were  obtained  from  the 
technical  school  files,  and  aptitude  test  scores  were  obtained  from  the  personnel 
database.  Technical  school  performance  was  reflected  by  the  subject's  mean  test 
scores  throughout  technical  training  for  the  AGE  career  field. 

Data  Analysis 

The  data  described  in  the  previous  sections  were  integrated  into  a  database  for 
use  in  this  study.  These  data  were  analyzed  to  investigate  the  hypotheses  stated  in 
Chapter III. The Comprehensive Job Knowledge Test served as an item pool from
which  items  were  selected  to  create  tests  that  represented  the  factors  used  in  this 
study.  The  two  factors  in  this  study  were  test  type,  as  determined  by  the  test  content 
theory  used  in  test  construction,  and  test  length  (test  content  sample  size).  Differences 
in  test  scores  that  were  hypothesized  as  attributable  to  these  factors  were  investigated. 


Two  domain  sampling  strategies  were  developed.  One  strategy  reflected  a 
weak  test  content  theory-based  approach  to  test  construction.  The  other  strategy 
represented  a  strong  test  content  theory-based  approach.  These  two  approaches 
were  used  to  select  items  from  the  Comprehensive  Job  Knowledge  Test  to  create 
sample  tests  that  served  as  exemplars  of  the  two  approaches  to  test  construction. 
Because  this  approach  was  intended  to  simulate  criterion-referenced  test  construction, 
test  response  characteristics  were  not  used  in  item  selection.  This  decision  was  based 
on  the  idea  that  criterion-referenced  tests  should  be  developed  to  represent  the 
content  domain  and  that  inclusion  or  exclusion  of  items  based  on  item  characteristics 
can  distort  that  representation.  Also,  in  most  test  construction  situations  test  responses 
are  not  available  during  test  construction. 

Because  the  issue  of  test  length  was  being  investigated,  tests  of  various  lengths 
were  developed  for  each  test  type  (strong  test  content  theory-based  and  weak  test 
content theory-based). Test lengths of 100, 50, 25, 12, and 6 items were used to
represent  the  common  range  of  lengths  found  in  criterion-referenced  tests. 

After  the  items  to  be  included  in  each  sample  test  were  identified,  each 
examinee's  responses  for  those  test  items  were  extracted  from  the  database  to 
represent how the examinee would have responded to that test. Examinees' percent
correct  scores  based  on  these  responses  were  computed,  giving  each  examinee  a 
score  on  each  of  the  different  sample  tests.  These  scores  simulate  what  the 
examinee's  performance  on  the  sample  tests  would  have  been  had  he/she  been 
administered  each  test  as  a  separate  entity. 
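
The scoring step described above can be illustrated with a minimal Python sketch. It computes a percent-correct score for one examinee on one sample test from that examinee's responses to the full item pool. The data layout (a dictionary of 0/1 item scores keyed by item number) and the name `percent_correct` are assumptions made for this illustration; the report does not describe the software actually used.

```python
def percent_correct(responses, item_ids):
    """Percent-correct score on a sample test.

    responses: dict mapping item number -> 1 (correct) or 0 (incorrect)
               for one examinee on the full item pool.
    item_ids:  item numbers that make up the sample test.
    """
    n_correct = sum(responses[i] for i in item_ids)
    return 100.0 * n_correct / len(item_ids)


# Hypothetical examinee responses on a tiny six-item pool.
examinee = {1: 1, 2: 0, 3: 1, 4: 1, 5: 0, 6: 1}
sample_test = [1, 3, 5]  # hypothetical three-item sample test
score = percent_correct(examinee, sample_test)
```

Because every examinee answered every item in the pool, the same function can be applied to any subset of item numbers without readministering a test.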

These  test  scores  were  analyzed  to  investigate  the  six  stated  hypotheses.  The 
test  scores  represented  tests  of  various  lengths,  constructed  by  two  different 
approaches, designed to measure the same content domain. The six hypotheses and
the analyses used to investigate them dealt with the areas of: 1) reliability of test
specifications,  2)  content-related  validity,  and  3)  construct-related  validity.  The  first 
three  hypotheses  dealt  with  the  reliability  of  test  specifications,  i.e.,  do  the 
specifications used to construct the test provide sufficient guidance such that alternate
forms  of  the  same  test  are  interchangeable.  The  fourth  hypothesis  addressed  the 
issue  of  content  validity,  i.e.,  how  representative  is  the  test  of  the  content  domain.  The 
fifth  hypothesis  dealt  with  construct  validity.  The  sixth  hypothesis  cut  across  the  three 
areas  by  dealing  with  the  impact  of  test  length  on  each  area.  A  description  of  the 
domain  sampling  strategies  employed  and  a  summary  of  the  analyses  to  address  the 
hypotheses  follow. 

Domain Sampling Strategies

Weak Test Content Theory-Based Tests. Tests that consisted of a random
sample  of  the  Comprehensive  Job  Knowledge  Test  items  were  developed  to  represent 
a  weak  test  content  theory-based  approach  to  test  development.  Random  sampling  of 
items  was  attained  by  use  of  a  random  number  generator  to  select  the  item  numbers  of 
items  to  be  included  in  each  sample  test.  Each  sample  was  an  independent  sample. 
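
A minimal sketch of this kind of random item selection is given below. It assumes a 376-item pool numbered 1 through 376 and draws each sample test independently, so two tests may share items. The form labels (for example `TR100A`), the function name, and the seed are illustrative assumptions, not details of the generator used in the study.

```python
import random

POOL_SIZE = 376                      # items in the Comprehensive Job Knowledge Test
TEST_LENGTHS = (100, 50, 25, 12, 6)  # sample test lengths used in the study


def draw_random_forms(rng, pool_size=POOL_SIZE, lengths=TEST_LENGTHS):
    """Draw an independent simple random sample of item numbers for
    two forms (A and B) at every test length."""
    items = range(1, pool_size + 1)
    forms = {}
    for length in lengths:
        for form in ("A", "B"):
            # Each call is a fresh draw without replacement within the form;
            # different forms are independent and may overlap.
            forms[f"TR{length}{form}"] = sorted(rng.sample(items, length))
    return forms


forms = draw_random_forms(random.Random(1992))  # seed chosen arbitrarily
```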
One would expect that, over repeated samplings, the percentage of test items
selected  from  each  content  area  would  equal  the  percentage  of  test  items  associated 
with  each  content  area.  That  value  was  calculated  for  each  content  area.  As 
previously  mentioned,  some  items  reflected  knowledges  required  in  more  than  one 
content  area.  Those  overlapping  test  items  were  counted  for  each  content  area  they 
were  associated  with.  Percentage  values  for  each  content  area  were  calculated  by 
dividing  the  number  of  items  associated  with  the  content  area  (including  overlap  items) 
by the total number of content area-item associations counted across all content areas
(including  overlaps)  and  multiplying  by  100.  The  item  sampling  pattern  that  could  be 
expected  over  repeated  instances  of  random  sampling  is  shown  in  Table  3. 
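
The percentage calculation described above, including the double counting of overlap items, can be sketched as follows. The mapping of items to content areas and the toy pool are hypothetical; only the arithmetic follows the description in the text.

```python
def expected_percentages(item_areas):
    """item_areas: dict mapping item number -> list of content areas the
    item is associated with (an overlap item lists several areas)."""
    counts = {}
    for areas in item_areas.values():
        for area in areas:
            counts[area] = counts.get(area, 0) + 1
    total = sum(counts.values())  # total item-area associations, overlaps included
    return {area: 100.0 * n / total for area, n in sorted(counts.items())}


# Hypothetical five-item pool; item 3 overlaps content areas 1 and 2.
toy_pool = {1: [1], 2: [1], 3: [1, 2], 4: [2], 5: [3]}
percentages = expected_percentages(toy_pool)
```

In the toy pool there are six item-area associations for five items, so the percentages are based on six, not five, which mirrors the treatment of overlap items in the text.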


Table 3

Expected Distribution of Items Across Content Areas

                                                                    Percentage
Content area                                                          of items

 1. Maintaining forms, records, and publications                          5.32
 2. Performing visual and service inspections                             7.98
 3. Performing periodic inspections                                      14.64
 4. Maintaining AGE electrical and electronic systems                    11.60
 5. Maintaining AGE engines, motors, and generators                      32.13
 6. Maintaining AGE hydraulic systems                                     4.18
 7. Maintaining AGE pneumatic systems                                     5.70
 8. Maintaining AGE enclosures, chassis, and drives                      12.74
 9. Dispatching AGE                                                       5.13
10. Maintaining special tools, shop equipment supplies and facilities      .57


Strong Test Content Theory-Based Tests. The strong test content theory-based
approach to test development was a weighted outline process based on, but not
identical to, the approach used to select content for measures in the Air Force Job
Performance Measurement Project (Lipscomb, 1984; Lipscomb & Dickinson, 1988).
The  content  domain  was  organized  into  content  areas  within  which  job  tasks  and 
associated  job  knowledges  fall,  based  on  occupational  analysis  information  as 
reported  in  the  most  recent  occupational  survey  report  for  the  career  field  and  SME 
judgment. 

Content  area  weights  were  developed  based  on  a  testing  emphasis  algorithm 
that computed the product of task-level SME ratings of training emphasis, occupational
survey information on the percent of individuals in the career field performing the task,
and the average relative time spent on the task. Expert judgment data were available
at the task level on these factors (Christal, 1975; Christal & Weismuller, 1976). The
raw weights were summed, and then each raw weight was divided by the total of the
raw  weights  and  multiplied  by  100  to  give  a  percentage  weight  for  each  content  area. 
Items  were  selected  for  each  sample  test  using  a  random  stratified  procedure 
reflecting  the  outline  of  the  content  domain  and  the  associated  weights.  Table  4 
shows  the  content  areas,  associated  weights,  and  the  number  of  items  to  be  selected 
for  each  length  test. 
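
The weighting and stratified selection steps can be sketched in Python as follows. Several details are assumptions made for illustration only: the text does not say how task-level products were combined within a content area (they are summed here), nor how fractional item counts were rounded, so the per-area item counts are simply passed in, as they would be from a specification table. All names and numbers in the example are hypothetical.

```python
import random


def percentage_weights(tasks):
    """tasks: list of (content_area, training_emphasis,
    percent_performing, relative_time_spent) tuples, one per task.
    Summing task products within an area is an assumption of this sketch."""
    raw = {}
    for area, emphasis, pct_performing, time_spent in tasks:
        raw[area] = raw.get(area, 0.0) + emphasis * pct_performing * time_spent
    total = sum(raw.values())
    return {area: 100.0 * w / total for area, w in raw.items()}


def stratified_sample(items_by_area, n_per_area, rng):
    """Randomly draw the required number of items within each content area."""
    selected = []
    for area, n in n_per_area.items():
        selected.extend(rng.sample(items_by_area[area], n))
    return sorted(selected)


# Hypothetical task ratings for two content areas.
tasks = [("A", 4.0, 50.0, 2.0), ("A", 2.0, 25.0, 2.0), ("B", 5.0, 20.0, 1.0)]
weights = percentage_weights(tasks)

# Hypothetical item pool and allocation (3 items from area A, 1 from area B).
pool = {"A": [1, 2, 3, 4, 5, 6], "B": [7, 8, 9]}
test_items = stratified_sample(pool, {"A": 3, "B": 1}, random.Random(0))
```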

A  comparison  of  the  values  in  Tables  3  and  4  indicated  that  there  were 
differences  in  the  item  sampling  that  would  occur  with  repeated  sampling  using  a 
weak  test  content  theory-based  versus  the  strong  test  content  theory-based  test 
specifications. Those differences were reflected primarily in three content areas,
Areas 3, 4, and 8, on the specifications charts. Other content areas showed smaller
differences.  Although  the  overall  differences  were  not  extreme,  they  were  sufficient  to 
investigate  the  sensitivity  of  test  outcome  to  differences  in  specifications.  Overall 
similarities  in  sampling  are  to  be  expected  between  two  strategies  that  reflect  the 
salient  features  of  a  content  domain.  Of  course,  there  is  no  way  to  know,  a  priori,  what 
a  single  random  sampling  of  test  items  would  look  like.  Additionally,  it  was  the  results 
of  actual  tests  developed  using  these  two  methods  that  were  of  primary  concern. 


Table 4

Strong Test Content Theory-Based Test Specifications

                                                        Percentage   Number of test items by test length
Content area                                                weight   100    50    25    12     6

 1. Maintaining forms, records, publications                     4     4     2     1     1     1
 2. Performing visual and service inspections                    7     7     4     2     1     1
 3. Performing periodic inspections                              4     4     2     1     0     0
 4. Maintaining AGE electrical and electronic systems           17    17     8     4     2     1
 5. Maintaining AGE engines, motors, and generators             35    35    17     9     4     2
 6. Maintaining AGE hydraulic systems                            3     3     2     1     0     0
 7. Maintaining AGE pneumatic systems                            4     4     2     1     1     0
 8. Maintaining AGE enclosures, chassis, and drives             22    22    11     5     3     1
 9. Dispatching AGE                                              3     3     2     1     0     0
10. Maintaining special tools, shop equipment                    1     1     0     0     0     0
    supplies and facilities

Analyses that focused on the issue of the reliability (or adequacy) of test
specifications were concerned with: 1) whether a set of test specifications provides
sufficient  guidance  and  structure  to  the  domain  sampling  process  such  that  two  tests 
developed  from  the  same  set  of  test  specifications  will  provide  comparable 
information,  and  2)  whether  test  scores  are  generalizable  across  test  types.  Thus, 
these  analyses  investigated  whether  scores  obtained  on  the  different  samples  of  test 
items  are  generalizable  both  within  the  test  type  and/or  across  test  types. 

To  address  the  first  question  of  the  reliability  of  test  specifications,  two  randomly 
parallel  forms  of  each  test  type/test  length  combination  were  needed.  Because  of  the 
breadth  of  the  Comprehensive  Job  Knowledge  Test  and  the  number  of  items  in  it, 
multiple  forms  of  the  two  test  types  could  be  constructed.  For  each  test  type,  the  two 
forms  were  analyzed  to  compare  the  psychometric  properties  of  the  randomly  parallel 
sample  tests.  Differences  in  means,  standard  deviations,  homogeneities,  and 
frequency  distributions  were  noted. 

To  investigate  the  degree  to  which  tests  developed  by  the  same  strategy 
provide  the  same  information  (i.e.,  the  same  rank  ordering  of  subjects)  across  varying 
item  sample  sizes,  intercorrelations  of  test  scores  within  test  construction  method  were 
computed.  Correlations  of  equivalence  were  computed  between  scores  on  A  and  B 
forms  of  the  same  length.  This  was  intended  to  approximate  for  each  test  type 
Cronbach's (1971) "duplicate experiment" that called for the comparison of forms of a
test  constructed  by  independent  test  developers  from  the  same  set  of  test 
specifications.  The  degree  of  equivalence  between  forms  was  seen  as  a  function  of 
the  quality  of  test  specifications.  The  underlying  premise  of  the  analysis  was  that  test 
specifications  should  be  so  well-defined  that  different  test  developers,  using  the  same 
set  of  specifications,  will  develop  equivalent  forms. 
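
A correlation of equivalence is an ordinary Pearson correlation between examinees' scores on the two forms. The sketch below shows the computation on hypothetical percent-correct scores; the function name and data are illustrative assumptions.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical percent-correct scores of five examinees on Forms A and B.
form_a = [70.0, 82.0, 64.0, 90.0, 76.0]
form_b = [68.0, 80.0, 70.0, 88.0, 72.0]
r_equivalence = pearson_r(form_a, form_b)
```

A high value indicates that the two forms rank examinees in much the same order, which is the sense of equivalence used in the text.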

The second question of the reliability of test specifications concerns the
generalizability of scores across test types. Generalizability theory explicitly recognizes
the existence of multiple sources of error variance and provides methods for
simultaneously estimating each (Kraiger, 1989). Generalizability theory allows the
researcher to identify factors affecting measurement (facets) and to estimate the
contribution of each factor to total score variance. Generalizability theory analyses
(Brennan, 1983; Shavelson, 1986) were conducted to investigate the reliability/
generalizability  of  scores  obtained  across  the  different  test  types  and  the  various  test 
lengths.  These  analyses  investigated  whether  or  not  tests  yield  the  same  scores  for 
examinees  regardless  of  the  test  type  used  to  obtain  the  scores.  One  of  the  two 
randomly  parallel  forms  for  each  test  type/sample  size  combination  that  were 
generated  for  the  previous  analyses  was  selected  at  random  for  use  in  this  analysis 
and the following analyses. Variance components for the person facet, test type facet,
and the interaction effects were estimated.

The content-related validity of the sample tests was investigated. Examinees'
scores  on  the  sample  tests  were  correlated  with  their  scores  on  the  total 
Comprehensive  Job  Knowledge  Test.  Scores  on  the  total  Comprehensive  Job 
Knowledge  Test  were  used  to  represent  the  best  estimate  of  the  individual's  domain 
score. 

Construct-Related  Validity  Analyses 

Because  construct  validation  is  the  process  of  information  gathering  and 
hypothesis  testing,  a  series  of  analyses  was  conducted.  Variables  previously 
demonstrated  to  be  related,  or  logically  assumed  to  be  related,  to  job  knowledge  were 
identified and used in these analyses.

Based  on  the  hypothesis  that  a  job  knowledge  test  score  should  be  related  to 
judgments  of  an  individual's  level  of  knowledge,  the  correlations  of  the  sample  test 
scores  with  self  and  supervisor  ratings  of  job  knowledge  were  computed. 

To  investigate  a  model  of  factors  related  to  job  knowledge,  a  hierarchical 
regression  analysis  was  conducted  for  the  Comprehensive  Job  Knowledge  Test  and 
for  each  sample  test.  No  causal  relationships  between  job  knowledge  and  the 
construct  variable  were  assumed;  only  the  level  of  association  was  investigated.  Data 
for  the  regression  analysis  were  available  on  a  subset  of  the  subject  sample,  those 
who participated earlier in the Air Force Job Performance Measurement Project. The
construct variables reflected the hypothesis that job knowledge is related to individual
aptitude for the job, training and experience on the job, and job performance.
Additionally,  correlations  of  scores  on  the  tests  with  a  variety  of  aptitude  indices  were 
investigated. 

Finally,  it  was  assumed  that  job  knowledge  increases  with  job  experience. 
Therefore,  a  comparison  was  made  between  scores  of  novices  (1-24  months  on  the 
job)  and  experts  (over  7  years  on  the  job)  on  the  Comprehensive  Job  Knowledge  Test 
and on each of the sample tests.

CHAPTER V


RESULTS 

Reliability of Test Specifications

As  previously  described,  a  series  of  analyses  were  conducted  to  address  the 
first  three  hypotheses  that  dealt  with  the  question  of  the  reliability  of  the  test 
specifications.  Two  randomly  parallel  forms  were  generated  for  each  item  sample  size 
of  each  test  type,  creating  a  Form  A  and  Form  B  for  each  sample  size  of  each  test  type. 
Test  scores  (percent  correct)  were  computed  for  each  subject  on  each  of  the  10 
sample  tests  and  the  Total  Test.  Tests  representing  weak  test  content  theory-based 
construction  are  coded  as  TR  (for  random  design  test)  in  the  following  tables.  Tests 
representing  strong  test  content  theory-based  construction  are  labelled  TJ  (for 
judgment design test). The Comprehensive Job Knowledge Test is labelled Total
Test.


Hypothesis  One  dealt  with  the  stability  of  test  psychometric  properties  across 
different  forms  of  the  same  test.  Table  5  shows  the  means  and  standard  deviations  of 
each  sample  test.  Tests  of  differences  between  correlated  pairs  of  means  were 
conducted  between  test  forms.  It  was  expected  that  strong  test  content  theory  tests 
(TJ)  would  display  less  variation  in  means  between  A  and  B  forms  than  would  weak 
test  content  theory  tests  (TR).  Differences  between  A  and  B  forms  were  significant  for 
all pairs of tests across both test types, with the exception of the TR25 tests. With the
exception  of  the  TR100  test,  the  differences  between  form  means  for  the  TR  tests  were 
smaller  in  magnitude  than  the  differences  between  form  means  for  the  TJ  tests.  Thus, 
these  results  do  not  support  Hypothesis  One. 
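
A test of the difference between correlated means, of the kind reported for the A and B forms, can be sketched as a paired t statistic. The scores below are hypothetical, and the function is an illustration of the standard formula rather than the program used in the study.

```python
from math import sqrt


def paired_t(form_a, form_b):
    """t statistic for the difference between two correlated means
    (the same examinees scored on both forms); df = n - 1."""
    diffs = [a - b for a, b in zip(form_a, form_b)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)
    return mean_d / sqrt(var_d / n), n - 1


# Hypothetical percent-correct scores for five examinees.
form_a = [70.0, 82.0, 64.0, 90.0, 76.0]
form_b = [68.0, 80.0, 70.0, 88.0, 72.0]
t_stat, df = paired_t(form_a, form_b)
```

With 294 examinees, even a difference of a few percentage points between form means can be statistically significant, which is worth keeping in mind when reading the t values in Table 5.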

In  the  context  of  Hypothesis  One,  it  was  expected  that  there  would  be  less 
variation  between  forms  in  the  properties  of  internal  consistency,  skewness,  kurtosis, 
and  range  for  tests  constructed  using  strong  test  content  theory  than  for  tests 
constructed  using  weak  test  content  theory.  Visual  inspection  of  Table  6  showed  no 
systematic  difference  between  test  types  in  the  agreement  between  A  and  B  forms  in 
internal  consistency  or  in  test  score  distribution  indices.  Internal  consistency  was 
moderately high in the 100-item test of both types, comparing favorably with the
internal consistency of the Total Test. As expected, internal consistency was reduced
as the test length decreased. However, the data do not suggest that the TJ tests were
more reliable across forms in terms of internal psychometric properties than the TR
tests. Therefore, Hypothesis One was not supported by these data.
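
Internal consistency is reported in Table 6 as coefficient alpha. A minimal sketch of that coefficient for a persons-by-items matrix of 0/1 item scores is shown below; the response matrix is hypothetical and the function name is an assumption of this example.

```python
def coefficient_alpha(item_scores):
    """Coefficient alpha for a persons-by-items matrix of item scores
    (rows = examinees, columns = items; 1 = correct, 0 = incorrect)."""
    k = len(item_scores[0])

    def variance(values):
        m = sum(values) / len(values)
        return sum((v - m) ** 2 for v in values) / (len(values) - 1)

    item_vars = [variance([row[j] for row in item_scores]) for j in range(k)]
    total_var = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical responses of five examinees to a four-item test.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
alpha = coefficient_alpha(responses)
```

Because alpha depends on the number of items, shorter sample tests would be expected to show lower values even when item quality is unchanged, consistent with the pattern noted above.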


Table 5

Psychometric Properties of Randomly Parallel Sample Tests and Total Test - Means
and Standard Deviations

                 Form A              Form B
Test type       Mean      SD        Mean      SD         t

TR100          74.51    9.14       71.31   10.02     12.74*
TR50           75.52   11.04       72.33   10.83      7.29*
TR25           68.68   10.26       70.38   12.35     -2.65
TR12           66.13   15.59       70.55   15.30     -4.74*
TR6            82.60   18.38       76.13   17.74      5.12*
TJ100          71.02    9.54       72.52    9.48     -5.28*
TJ50           71.37   10.00       67.80    9.76      8.17*
TJ25           75.10   11.31       79.85   11.02     -8.47*
TJ12           72.05   14.98       64.14   14.89      8.80*
TJ6            81.01   17.12       67.91   19.92      9.98*
Total Test     71.78    8.93

Note. TR = Random design test; TJ = Judgment design test. N = 294.


The  data  shown  in  Tables  7  and  8  address  the  second  hypothesis,  which  held 
that  tests  developed  using  strong  test  content  theory  would  be  more  nearly  equivalent 
across  forms  than  tests  developed  using  weak  test  content  theory.  Shown  are  the 
intercorrelations of test scores for both the TR and the TJ sample tests. Correlations of
equivalence were computed between forms of the tests. Additionally, all scores within
test type were intercorrelated to assess the level of agreement between the various
form/length combinations for a test type. It was expected that the TJ tests would show
higher positive correlations of equivalence and higher intercorrelations overall.
Comparison of the correlations of equivalence shown for the TR tests and those shown
for the TJ tests indicated a high degree of similarity in magnitude and pattern. A and B
forms  of  the  100-item  tests  of  both  types  were  highly  correlated,  indicating  that  the  two 
test  forms  were  rank  ordering  subjects  much  the  same.  As  was  to  be  expected,  for 
both  test  types  correlations  between  A  and  B  test  forms  decreased  in  magnitude  as 
test  length  decreased.  In  general,  correlations  between  sample  tests  of  the  same  test 
type  were  moderate,  ranging  from  .28  to  .90  for  the  TR  tests  and  from  .27  to  .87  for  the 
TJ tests. Thus, no consistent differences were observed in the magnitudes or patterns
of correlations when the values in Tables 7 and 8 were compared. Therefore, TJ tests
did not show more agreement across test forms and lengths, and Hypothesis Two was
not  supported. 

Table 6

Psychometric Properties of Randomly Parallel Sample Tests and Total Test - Alpha,
Skewness, Kurtosis, and Range

                Alpha         Skewness        Kurtosis                  Range
                Form            Form            Form                    Form
Test           A     B        A      B        A      B               A               B

TR100        .83   .84     -.42   -.46     -.19   -.13     47.00-94.00     41.00-92.00
TR50         .75   .72     -.54   -.78     -.08    .90     40.00-98.00     34.00-96.00
TR25         .39   .57     -.46   -.47      .06    .24     36.00-88.00     24.00-96.00
TR12         .40   .38     -.39   -.36     -.12    .09    16.67-100.00    16.67-100.00
TR6          .36   .24    -1.03   -.45      .65   -.02    16.67-100.00    16.67-100.00
TJ100        .82   .83     -.44   -.26      .04   -.39     40.00-92.00     46.00-92.00
TJ50         .67   .67     -.23   -.54     -.03    .16     40.00-96.00     42.00-90.00
TJ25         .54   .53     -.46   -.64     -.27    .16     44.00-96.00    44.00-100.00
TJ12         .43   .40     -.46   -.50      .12    .34    16.67-100.00    16.67-100.00
TJ6          .24   .33     -.61   -.48     -.51   -.20    33.33-100.00    16.67-100.00
Total Test   .95           -.51             .06            43.09-91.22

Note. TR = Random design test; TJ = Judgment design test. N = 294.


Using the A forms, a generalizability theory analysis was conducted to
investigate the reliability/generalizability of scores obtained across the different test
types and various test lengths, as addressed in Hypothesis Three. As shown in Table
9, variance components were estimated for the person facet, test type facet, test length
facet, and the interaction effects. Variance component values of (0.0) are shown in the
table  indicating  that  estimated  variance  components  for  those  factors  were  negative 
even  though,  by  definition,  variance  components  are  nonnegative  (Brennan,  1983). 
This  result  is  not  uncommon  with  small  sample  sizes  due  to  sampling  variability.  The 
negative  value  was  replaced  with  0.0  to  avoid  biasing  other  variance  components. 
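
The estimation and the handling of negative estimates can be sketched as follows. The sketch assumes a fully crossed person x test type x test length design with one score per cell and all facets treated as random, and it uses the standard expected-mean-square solutions for that design; the report does not state these details, so they are assumptions. The mean squares in the example are hypothetical and are not values from this study.

```python
def g_study_components(ms, n_p, n_t, n_l):
    """Estimate variance components from ANOVA mean squares for a fully
    crossed p x t x l random-effects design with one score per cell.
    Negative estimates are replaced with 0.0."""
    raw = {
        "ptl": ms["ptl"],  # highest-order interaction confounded with error
        "pt": (ms["pt"] - ms["ptl"]) / n_l,
        "pl": (ms["pl"] - ms["ptl"]) / n_t,
        "tl": (ms["tl"] - ms["ptl"]) / n_p,
        "p": (ms["p"] - ms["pt"] - ms["pl"] + ms["ptl"]) / (n_t * n_l),
        "t": (ms["t"] - ms["pt"] - ms["tl"] + ms["ptl"]) / (n_p * n_l),
        "l": (ms["l"] - ms["pl"] - ms["tl"] + ms["ptl"]) / (n_p * n_t),
    }
    return {effect: max(0.0, value) for effect, value in raw.items()}


# Hypothetical mean squares for 294 persons, 2 test types, 5 test lengths.
mean_squares = {"p": 890.0, "t": 1000.0, "l": 11840.0,
                "pt": 100.0, "pl": 80.0, "tl": 3030.0, "ptl": 90.0}
components = g_study_components(mean_squares, n_p=294, n_t=2, n_l=5)
```

In this hypothetical example the raw estimates for the test type and person-by-length effects are negative and are therefore reported as 0.0, which is the situation described in the text.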

Table 7

Intercorrelations of Weak Test Content Theory-Based Sample Tests

Column order: TR100A, TR50A, TR25A, TR12A, TR6A, TR100B, TR50B, TR25B, TR12B, TR6B

TR100A
TR50A     .85
TR25A     .68  .65
TR12A     .64  .59  .46
TR6A      .54  .53  .38  .34
TR100B    .90  .68  .62  .50
TR50B     .82  .58  .57  .50  .82
TR25B     .75  .69  .57  .41  .72  .68
TR12B     .65  .57  .43  [illegible]  .35  .64  .57  .48
TR6B      .52  .49  .40  .37  .28  .50  .46  .44  .37

Note. Correlations of equivalence are underlined.
All correlations significant at (p < .001) level.
N = 294.


Table  8 

Intercorrelations of Strong Test Content Theory-Based Sample Tests

In the context of Hypothesis Three, it was expected that there would be a
relatively large variance component associated with test type, which would indicate
that test scores cannot be generalized across test types. Comparison of the relative
contributions of the estimated variance components to the total variance indicated that
the undifferentiated error factor (ptl) accounted for the most variance. The person
factor was the next largest source of variance; this variance is desirable, as it indicates
that individual differences in test responses had a strong influence on test scores. Test
length had the next largest, though relatively small, variance component, indicating
that the length of the test affected the reliability of the test scores and that test scores
from one length test are not generalizable to another length test. The zero value for
the test type variance component indicated that test type did not contribute to the
systematic variance in test scores; thus, it made little difference which test type scores
were associated with. There was a relatively small variance component associated
with the tl interaction, indicating that test length affects scores differently depending on
the type of test, but not so as to be a major source of variance. No measurable
variance component was associated with the pl interaction, and very little was
associated with the pt interaction. Thus, the results of the generalizability analysis
indicated that test scores are generalizable across test type, not supporting Hypothesis
Three. 


Table 9

Estimated Variance Components for G-Study with Two Test Types and Five Item
Sample Sizes

Effect                df        MS

Person (p)           293     80.39
Test type (t)          1     (0.0)
Test length (l)        4     16.12
pt                   293      1.84
pl                  1172     (0.0)
tl                     4     12.97
ptl                 1172     91.42

N = 294

Content-Related  Validity 

Hypothesis  Four  held  that  strong  test  content  theory-based  test  construction 
would  produce  tests  with  evidence  of  better  content  validity  than  would  weak  test 
content  theory-based  test  construction.  Total  test  scores  were  used  as  the  best 
representation of the true domain score. To investigate how closely the sample tests
approximated the Total Test in the assessment of individuals, Form A tests of both
types were correlated with the Total Test. As shown in Table 10, no significant
differences (p < .001) were found between the TR and TJ correlations with the Total
Test.


Table 10

Zero-Order Correlations of Sample Tests with Total Test Score

                     Test type
Test length         TR      TJ     Hotelling's t

100                .94     .93          1.19
50                 .88     .86           .85
25                 .71     .77         -2.14
12                 .67     .66           .18
6                  .53     .50           .61

Note. p < .001 for all correlations; t value required for significance at
p < .001 = 3.30 for differences between correlations.
N = 294
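
The comparison of the TR and TJ correlations with the Total Test involves two dependent correlations that share one variable. A sketch of Hotelling's t for that situation is given below, using the commonly cited form of the statistic with N - 3 degrees of freedom. The input values are hypothetical; in particular, the statistic also requires the correlation between the two sample tests, which Table 10 does not report.

```python
from math import sqrt


def hotelling_t(r_xy, r_xz, r_yz, n):
    """Hotelling's t for the difference between two dependent
    correlations r_xy and r_xz that share variable x; df = n - 3."""
    # Determinant of the 3 x 3 correlation matrix of x, y, and z.
    det = 1 - r_xy**2 - r_xz**2 - r_yz**2 + 2 * r_xy * r_xz * r_yz
    t = (r_xy - r_xz) * sqrt((n - 3) * (1 + r_yz) / (2 * det))
    return t, n - 3


# Hypothetical correlations: x = total score, y and z = two sample tests.
t_stat, df = hotelling_t(r_xy=0.50, r_xz=0.40, r_yz=0.60, n=103)
```

With the study's sample of 294 examinees the same formula gives 291 degrees of freedom.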


As  was  to  be  expected,  correlations  decreased  as  test  length  decreased. 
Correlations  were  moderate  to  high  for  both  the  TR  tests  and  TJ  tests.  Correlations  of 
this  magnitude  were  not  unexpected,  as  these  are  part-whole  correlations.  These 
correlations  suggest  that  a  sample  of  100  items  from  the  Total  Test  gives  a  good 
representation  of  the  Total  Test  regardless  of  which  sampling  method  is  used.  These 
correlations  also  reflect  the  reduction  in  agreement  between  the  sample  tests  and  the 
Total  Test  as  the  number  of  items  decreases.  However,  even  with  as  few  as  6  items,  a 
moderate  correlation  between  sample  and  Total  Test  was  achieved  in  both  test  types. 
Based on these results, Hypothesis Four was not supported.

Construct Validation

In  order  to  investigate  the  construct  validity  of  the  sample  tests,  as  addressed  by 
Hypothesis  Five,  variables  hypothesized  to  be  related  to  an  individual's  level  of  job 
knowledge were identified and data were collected on those variables. Table 11 lists
those variables and descriptive statistics for each. An intercorrelation matrix of all
variables  used  in  the  construct  validation  study  is  given  in  Appendix  D. 

Table  12  shows  the  correlations  of  the  Total  Test  and  each  of  the  sample  tests 
with self ratings of job knowledge and supervisor ratings of job knowledge. It was
expected that TJ tests would have higher positive correlations with the self and
supervisors' ratings of job knowledge than would the TR tests. Correlations ranged
from fairly low to moderate. All correlations were significant (p < .001). As was to be
expected, correlations decreased in magnitude as test length decreased. However,
no significant differences (p < .001) were found between test types in correlations with
self or supervisor's ratings. For both test types, test correlations were, in general,
slightly  higher  with  self  ratings  than  with  supervisor  ratings. 


Table 11

Means, Standard Deviations and Range of Construct Validation Measures

Construct variable                   Mean      SD   Range            N

Job knowledge ratings-self           3.53     .61   2.00-5.00      293
Job knowledge ratings-supervisor     3.60     .88   1.00-5.00      294
Experience/training ratings          3.44     .66   1.75-5.00      292
Technical training grade            89.04    4.74   76.00-98.00     79
Job performance score              143.55   23.00   97.01-222.79    81
Performance rating-supervisor        3.68     .63   2.10-4.87       80
Performance rating-self              3.75     .61   2.41-5.00       80
Performance rating-peer              3.66     .47   2.71-4.68       80
General science                     53.45    6.15   39-66           69
Arithmetic reasoning                52.58    6.17   41-66           69
Word knowledge                      52.26    4.86   42-61           69
Paragraph comprehension             52.51    5.90   32-61           69
Numerical operations                51.61    6.62   35-62           69
Coding speed                        51.42    6.16   35-64           69
Auto/shop information               58.12    6.30   44-69           69
Math knowledge                      52.43    7.40   38-66           69
Mechanical comprehension            56.75    5.59   41-67           69
Electronic information              54.43    8.07   37-70           69
Verbal composite                    52.49    4.52   42-62           69
Mechanical composite               226.43   18.28   175-265         69
Administrative composite           155.52   11.80   131-178         69
General composite                  105.07    8.16   92-128          69
Electronic composite               212.90   18.26   184-253         69
Air Force Qualifying Test           78.80    7.30   67.0-98.5       69


Table 12

Correlations of Total Domain Test and Sample Tests with Self and Supervisor's
Ratings of Job Knowledge

                      Self ratings              Supervisor ratings
Test length        TR      TJ       t          TR      TJ       t

100               .45     .41    1.42         .40     .39     .31
50                .46     .42     .93         .32     .37   -1.32
25                .38     .36     .32         .29     .33    -.88
12                .30     .34    -.67         .24     .31   -1.09
6                 .29     .30    -.13         .24     .33   -1.35
Total test        .48                         .44

Note. p<.001 for all correlations; t values are for Hotelling's t-tests with df=291.
No t values reached significance at p<.001.

N = 293 for correlations with Self Ratings;
N = 294 for correlations with Supervisor Ratings.
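
The comparisons in Table 12 involve two correlations that share one variable (the rating) and come from the same examinees. A minimal sketch of Hotelling's t for that situation follows. The function name `hotelling_t` is introduced here for illustration, and the TR-TJ intercorrelation `r23 = 0.85` is a hypothetical value: the correlation between the two sample tests is not reported in this section, so the result is not expected to reproduce the tabled t values exactly.

```python
import math


def hotelling_t(r12, r13, r23, n):
    """Hotelling's t for two dependent correlations sharing variable 1.

    r12, r13: correlations of the shared variable with variables 2 and 3
    r23:      correlation between variables 2 and 3
    n:        number of examinees; the statistic has n - 3 degrees of freedom
    """
    # Determinant of the 3 x 3 correlation matrix.
    det = 1 - r12 ** 2 - r13 ** 2 - r23 ** 2 + 2 * r12 * r13 * r23
    t = (r12 - r13) * math.sqrt(((n - 3) * (1 + r23)) / (2 * det))
    return t, n - 3


# Supervisor ratings with TR100 (.40) and TJ100 (.39), N = 294 (Table 12).
# r23 = 0.85 is an assumed TR100-TJ100 correlation, not a reported value.
t_stat, df = hotelling_t(r12=0.40, r13=0.39, r23=0.85, n=294)
print(f"t({df}) = {t_stat:.2f}")
```

The sign convention matches the table: t is positive when the TR correlation is the larger of the two and negative when the TJ correlation is larger.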


A conceptual model of factors hypothesized to be related to job knowledge was
developed. The model specified variables thought to be associated with job
knowledge. The model was analyzed via hierarchical regression analysis. Variables
were entered into the regression equation in the order of the hypothesized strength of
association, as shown in Table 13. Those variables thought to be most highly
associated were entered into the equation first. The relationship of each sample test
and the Total Test to these variables was analyzed using this model. Scores on the
Mechanical Composite of the ASVAB were used to represent aptitude. The final R² for
each model was computed to assess the association of each test score with a
weighted linear composite of the construct validation model variables. In the context
of Hypothesis Five, it was expected that the TJ tests would show a stronger association
with the construct variables, i.e., a higher squared multiple correlation coefficient (R²)
for the TJ regression equations.

As shown in Table 13, only the Total Test and the TJ100 test showed significant
associations with the weighted linear composite of the construct variables used in this
analysis, as indicated by the squared multiple correlation coefficients. No difference in
pattern of association, i.e., R at each step, was apparent between the test types. It can
be noted that the squared multiple correlation coefficient for each TJ test equation is
greater in magnitude than for the same length TR test equation. However, because
the differences between the TR and TJ tests, in their association with the construct
variables, are so small, the results of this analysis do not suggest any meaningful
differences between test types.


Table 13

Hierarchical Regression Results for Total Test and Sample Tests

Variables (listed in
order of entry into                 Total               TR                             TJ
regression equation)                 test    100    50    25    12     6     100    50    25    12     6

                                       R      R     R     R     R     R       R     R     R     R     R

1. Training                          .23    .24   .18   .21   .11   .05     .26   .22   .06   .11   .06
2. Aptitude                          .48    .48   .36   .32   .28   .11     .52   .40   .33   .31   .21
3. Experience/training               .48    .48   .39   .32   .28   .15     .52   .44   .34   .31   .28
4. Job performance score             .53    .52   .43   .35   .31   .19     .55   .46   .40   .43   .28
5. Supervisor's performance rating   .54    .52   .44   .38   .31   .19     .55   .47   .43   .44   .33
6. Self performance rating           .58    .54   .48   .38   .36   .26     .57   .49   .50   .45   .36
7. Peer performance rating           .60    .56   .49   .40   .38   .26     .60   .51   .50   .49   .37

R²                                   .36*   .31   .24   .16   .14   .07     .36*  .26   .25   .24   .14

N = 69
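
As a consistency check on Table 13, the final-step multiple correlation for each equation can be squared and compared with the tabled squared multiple correlation. The sketch below assumes the column order Total, TR100 through TR6, TJ100 through TJ6 (the column headings are partly illegible in the source); the dictionary names are introduced here for illustration only.

```python
# Multiple R after the last predictor (step 7) was entered, from Table 13.
final_step_r = {
    "Total": 0.60,
    "TR100": 0.56, "TR50": 0.49, "TR25": 0.40, "TR12": 0.38, "TR6": 0.26,
    "TJ100": 0.60, "TJ50": 0.51, "TJ25": 0.50, "TJ12": 0.49, "TJ6": 0.37,
}

# Squared multiple correlations from the bottom row of Table 13.
reported_r2 = {
    "Total": 0.36,
    "TR100": 0.31, "TR50": 0.24, "TR25": 0.16, "TR12": 0.14, "TR6": 0.07,
    "TJ100": 0.36, "TJ50": 0.26, "TJ25": 0.25, "TJ12": 0.24, "TJ6": 0.14,
}

# Square each final-step R and round to the two decimals used in the table.
recomputed_r2 = {name: round(r * r, 2) for name, r in final_step_r.items()}

# Collect any equation whose recomputed value differs from the tabled value.
mismatches = {
    name: (recomputed_r2[name], reported_r2[name])
    for name in final_step_r
    if abs(recomputed_r2[name] - reported_r2[name]) > 0.005
}
print(mismatches)
```

An empty dictionary indicates that the bottom row is the square of the step-7 row to two decimal places, which is what the text's description of the final squared multiple correlation implies.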


Also, to assess the relative construct validity of the tests, test scores were
correlated with aptitude indicators, as shown in Table 14. It was expected that the
Total Test and sample tests would correlate with the mechanical aptitude indicators
and not correlate with tests of other aptitude areas. Significant correlations (p<.001)
were seen for the Total Test and the 100-item tests of both types with the mechanical
aptitude indicators. TJ100 correlations with mechanical aptitude indicators were
slightly higher than the TR100 correlations with mechanical aptitude indicators.
However, these differences were not significant (p<.001). Shorter length tests of either
type did not correlate at a significant level with any of the mechanical indices. With the
exception of a significant correlation between TR12 and coding speed, no significant
correlations were seen with aptitude indicators not related to mechanical aptitude.

As a final investigation of construct validity, the sensitivity of the Total Test and
sample tests to detect differences between novices and masters in test performance
was assessed. It was expected that the TJ tests would be more sensitive to differences
between novices and masters in job knowledge. As shown in Table 15, there were
significant differences between novices and masters on all tests. Longer tests
appeared to be more effective in assessing differences than shorter tests for both test
types. It was noted that the differences for the TJ tests were slightly greater, overall,
than those for the TR tests, suggesting that the TJ tests may be more sensitive to
differences between masters and novices.


Table 14

Correlations of Total Domain Test and Sample Tests with Aptitude Indicators

                               Total                 TR                                 TJ
Aptitude                        test    100     50     25     12      6    100     50     25     12      6

General science                  .16    .14    .06    .16   -.01    .06    .15    .06    .10    .16   -.16
Arithmetic reasoning             .15    .13    .13    .12    .16   -.01    .11    .02    .16   -.03    .14
Word knowledge                   .18    .19    .07    .10    .02    .13    .23    .00    .05   -.02   -.06
Paragraph comprehension          .27    .28    .20    .11    .24    .18    .23    .21    .11    .05    .03
Numerical operations             .02    .10    .17    .06    .20   -.12    .04    .11   -.01    .10    .28
Coding speed                     .24    .34    .32    .16    .42*   .11    .24    .28    .14    .20    .20
Auto/shop information            .45*   .43*a  .25    .26    .21    .16    .45*a  .38    .25    .26    .23
Math knowledge                   .12    .11    .13    .23    .07    .02    .11   -.02    .05    .10    .13
Mechanical comprehension         .39*   .40*a  .24    .26    .21   -.02    .46*a  .29    .22    .34    .24
Electronic information           .28    .30    .24    .21    .03    .16    .35    .17    .19    .22    .16
Verbal composite                 .24    .26    .13    .12    .11    .17    .26    .09    .07    .01   -.03
Mechanical composite             .48*   .47*a  .26    .21    .24    .12    .50*a  .27    .24    .24    .18
Administrative composite         .23    .23    .21    .16    .27    .06    .24    .25    .09    .22    .20
General composite                .24    .24    .17    .15    .18    .09    .22    .06    .16   -.02    .09
Electronic composite             .27    .27    .22    .22    .09    .09    .29    .09    .19    .19    .12
Armed Forces Qualifying Test     .22    .21    .26    .18    .27    .07    .27    .13    .16    .05    .20

Note. Mechanical Aptitude Indices are underlined.

Correlations having the same subscript were compared and are not significantly different at
p<.001.

*p<.001


Table 15

Mean Scores and Standard Deviations of Novices and Masters on Total Test and
Sample Tests

                 Novices             Masters
Test           Mean      SD       Mean      SD          t

Total         66.69    8.00      79.60    6.75     -10.00*
TR100         69.74    8.09      81.74    7.93      -8.83*
TR50          70.49   10.32      84.04    8.02      -8.29*
TR25          63.71    9.20      75.06    6.86      -7.86*
TR12          60.81   15.02      72.88   14.89      -4.76*
TR6           77.78   19.89      90.52   13.02      -4.18*
TJ100         66.31    8.25      79.04    7.63      -9.33*
TJ50          66.32    8.95      80.00    8.41      -9.21*
TJ25          69.62   10.49      82.74    9.24      -7.67*
TJ12          65.62   15.33      82.03   12.06      -6.74*
TJ6           74.47   17.38      92.16   10.73      -6.70*

Note. Novices = 1-24 months in career field (N=111). Masters = 84 months or
more in career field (N=51).

*p<.001
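
The t values in Table 15 can be approximately recovered from the tabled means and standard deviations. The sketch below assumes an independent-samples t-test with pooled variance (the report does not state which form was used) and group sizes of 111 novices and 51 masters as read from the table note; `pooled_t` and `table_15` are names introduced here for illustration. Small discrepancies from the tabled values are expected because the inputs are rounded to two decimals.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Independent-samples t statistic using a pooled variance estimate."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error


N_NOVICES, N_MASTERS = 111, 51  # group sizes as read from the table note

# (novice mean, novice SD, master mean, master SD) from Table 15
table_15 = {
    "Total": (66.69, 8.00, 79.60, 6.75),
    "TR100": (69.74, 8.09, 81.74, 7.93),
    "TR50": (70.49, 10.32, 84.04, 8.02),
    "TR25": (63.71, 9.20, 75.06, 6.86),
    "TR12": (60.81, 15.02, 72.88, 14.89),
    "TR6": (77.78, 19.89, 90.52, 13.02),
    "TJ100": (66.31, 8.25, 79.04, 7.63),
    "TJ50": (66.32, 8.95, 80.00, 8.41),
    "TJ25": (69.62, 10.49, 82.74, 9.24),
    "TJ12": (65.62, 15.33, 82.03, 12.06),
    "TJ6": (74.47, 17.38, 92.16, 10.73),
}

t_values = {
    name: pooled_t(m1, s1, N_NOVICES, m2, s2, N_MASTERS)
    for name, (m1, s1, m2, s2) in table_15.items()
}
for name, t in t_values.items():
    print(f"{name:6s} t = {t:7.2f}")
```

The negative sign simply reflects that the novice mean is entered first; a larger absolute t at a given test length indicates a sharper separation between the two groups.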


Overall, the analyses to investigate the construct validity of the sample tests
provided very little support for the fifth hypothesis. The comparison of the differences
between the means of masters and novices on the two test types suggested that
the TJ tests might be more sensitive to differences between those two groups.
However, all the novices-masters differences were significant for both test types, and
the difference between test types was small. Also, the regression analysis suggested
that the TJ tests had a slightly stronger association with the set of construct validation
variables. Again, because these differences were so slight, little meaning can be
attached to them.

The last hypothesis, Hypothesis Six, proposed that test type differences in
reliability of test specifications, content validation, and construct validation would
increase as test length decreased. As no meaningful differences between test types
were found in any of the areas investigated, Hypothesis Six could not be supported.
However, as would be expected, overall reductions in indices of test quality across the
three areas of investigation were seen as test length decreased.


CHAPTER  VI 


DISCUSSION 

This  study  was  intended  to  address  issues  that  arise  when  a  test  developer  is 
required  to  construct  a  test  that  cannot  completely  cover  the  content  domain  due  to  test 
administration  constraints,  such  as  testing  time.  In  this  common  situation,  the  test 
developer  must  rely  on  a  sample  from  the  content  domain  to  represent  the  total  content 
domain.  Underlying  the  sampling  of  the  content  domain  is  a  test  content  theory,  either 
implicitly or explicitly defined. An implicit test content theory can be seen as a weak
test content theory, in which the underlying test content theory is not given much
emphasis. An explicitly defined theory can be categorized as a strong test content
theory.  The  question  arises  as  to  how  well  the  sample,  selected  as  a  result  of  the 
underlying  theory,  represents  the  content  domain. 

Six  hypotheses  were  investigated  in  this  study.  The  proposition  underlying 
each  of  the  hypotheses  was  that  strong  test  content  theory  provides  better  definition 
and  structure  to  test  development  and  that  this  structure  and  definition  is  needed  to 
produce  a  quality  test.  Thus,  it  was  hypothesized  that  strong  test  content  theory,  as 
reflected  in  the  test  specifications,  would  produce  a  higher  quality  test  than  one 
developed  using  a  weak  test  content  theory. 

The  hypotheses  investigated  in  this  study  dealt  with  three  general  areas. 
Hypotheses  One,  Two  and  Three  dealt  with  the  reliability  of  test  specifications. 
Hypothesis  Four  dealt  with  content  validity,  and  Hypothesis  Five  dealt  with  construct 
validity. Hypothesis Six dealt with the interaction of test length with the effects of test
content  theory  across  the  three  general  areas. 

Hypothesis  One  posited  that  alternate  forms  of  a  test  developed  using  strong 
test  content  theory  would  be  more  comparable  in  terms  of  internal  psychometric 
properties  than  would  alternate  forms  of  a  test  developed  using  weak  test  content 
theory.  This  hypothesis  was  not  supported.  No  systematic  differences  were  seen  in 
the  comparability  between  alternate  forms  of  tests  developed  using  weak  test  content 
theory (WTCT) versus tests developed according to strong test content theory (STCT).

Hypothesis Two held that alternate forms of STCT tests would be more nearly
equivalent than alternate forms of WTCT tests. The analysis to investigate this was an
attempt to approximate, for each test type, the "duplicate experiment" called for by
Cronbach (1971) to investigate rigorously the match between the operational
definition used to construct the test and the actual test operations, and to compare the
results for the two test types. No difference between test types was seen in
correlations of equivalence. Therefore, this hypothesis was not supported. A large
variation in the degree of equivalence between test forms was seen, with correlations
of equivalence ranging from .28 to .90 for the weak test content theory tests and from
.27 to .87 for the strong test content theory tests. The degree of equivalence seen
appeared to be a function of test length, with the highest degree of equivalence seen
between the 100-item tests.

It  is  of  interest  to  note  that  Ebel's  (1962)  comparison  of  test  forms  developed  by 
two  independent  test  developers,  each  using  the  same  set  of  specifications,  found  a 
similar level of intercorrelation between test forms (r=.92) as observed in this study
(r=.90 for TR100 and r=.87 for TJ100). The content domain of Ebel's test was word
knowledge.  The  test  specifications  reflected  a  spaced  sampling  of  100  words  from  a 
specified  dictionary.  Ebel  concluded  that,  based  on  the  sampling  fluctuations  seen  in 
the  scores,  tests  of  many  more  than  100  items  would  be  needed  to  yield  equivalent 
scores  from  alternate  forms.  Thus,  it  would  appear  that  larger  samples  of  the  domain 
or  more  precise  test  specifications  are  needed  to  assure  equivalent  forms  when  item 
selection  is  based  on  test  specifications  alone,  without  the  use  of  item  analysis 
information. 

Hypothesis  Three  stated  that  test  scores  are  not  generalizable  across  test  type 
(WTCT  versus  STCT)  in  the  context  of  generalizability  theory.  Generalizability  theory 
analysis found that no systematic variance in scores was associated with test type.
Thus,  this  hypothesis  was  not  supported. 

Hypothesis  Four  proposed  that  STCT  tests  would  exhibit  evidence  of  better  test 
content  validity  than  would  WTCT  tests.  No  differences  between  test  types  were  seen 
in correlations with the Comprehensive Job Knowledge Test scores. Therefore,
Hypothesis  Four  was  not  supported. 

Hypothesis Five proposed that STCT tests would exhibit stronger
evidence of construct validity than would WTCT tests. A series of analyses comparing
evidence of the construct validity of tests developed by using a strong test content
theory and tests developed by using a weak test content theory provided no
meaningful support for this hypothesis. The STCT tests appeared to be somewhat
more sensitive to differences between masters and novices, and slight, but consistent,
differences between test types were seen in the association of test scores with the
construct validation variables; however, the magnitudes of the differences were not
large enough to be considered meaningful.

Finally, Hypothesis Six proposed that differences in test quality between STCT
tests and WTCT tests would increase as test content sample size (i.e., number of test
items) decreased. As no meaningful differences were found between test types, this
hypothesis was not supported. As would be expected, both test reliability and validity
decreased as test length decreased across both test types; this general effect has
been well-documented in both theory and previous research.
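
The general relationship between test length and reliability is commonly expressed by the Spearman-Brown prophecy formula, which this report does not itself cite. The sketch below is illustrative only: the function name `spearman_brown` is introduced here, and the starting reliability of .90 for a 100-item test is a hypothetical value chosen for the example, not a reliability estimate from this study.

```python
def spearman_brown(reliability, length_ratio):
    """Projected reliability when test length is multiplied by length_ratio."""
    return (length_ratio * reliability) / (1 + (length_ratio - 1) * reliability)


BASE_LENGTH = 100        # items in the reference test
BASE_RELIABILITY = 0.90  # hypothetical reliability of the reference test

# Project reliability for the shorter sample-test lengths used in the study.
projected = {
    k: spearman_brown(BASE_RELIABILITY, k / BASE_LENGTH)
    for k in (100, 50, 25, 12, 6)
}
for k, rel in projected.items():
    print(f"{k:3d} items: projected reliability = {rel:.2f}")
```

Under the formula's assumption that added or removed items are parallel to the rest, projected reliability falls slowly at first and then steeply as the test becomes very short, which is the qualitative pattern described in the text.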

Thus, the hypothesized differences in test quality between STCT tests and
WTCT tests were not found in this study. However, as this study is the only study to
date making this comparison and only two sets of test specifications were compared, it
would be premature to conclude at this time that there are no differences in the
characteristics of tests developed using a weak test content theory versus a strong
test content theory. As noted earlier, little empirical research has been conducted in
this area. Thus, there is no body of literature into which to integrate these results and
with which to make comparisons. In assessing the meaning of the results of this
study, several issues should be considered.

First,  the  test  specifications  that  represented  strong  test  content  theory  in  this 
study  were  not  dramatically  different  from  what  would  be  expected  over  repeated 
random  sampling.  This  was  not  unexpected  as  the  strong  test  content  theory  test 
specifications  were  designed  to  reflect  the  salient  features  of  the  content  domain. 
Although  differences  were  seen  in  emphasis  on  certain  areas  of  the  content  domain, 
the  differences  were  not  dramatic.  Thus,  an  alternate  conclusion  might  be  that  test 
results  are  not  sensitive  to  slight-to-moderate  variations  in  test  specifications  that  result 
from  different  test  construction  theories.  No  conclusions  can  be  drawn  as  to  the  impact 
of  dramatically  different  test  specifications,  as  might  be  appropriate  in  other  domains  or 
for tests for other purposes.

Also,  it  should  be  noted  that  the  specification  of  the  content  domain  underlying 
the  test  development  for  both  types  of  tests  was  very  thorough  and  subject  to  the 
judgment  of  subject  matter  experts  as  to  what  was  included.  Thus,  although  efforts 
were taken to include all elements of the content domain, there was little included in
the  content  domain,  as  it  was  defined,  that  could  be  considered  trivial  or  irrelevant. 
Therefore,  it  could  be  concluded  that,  given  a  well-defined  test  content  domain, 
relatively  small  differences  in  test  construction  specifications  have  no  significant 
impact  on  the  resulting  test  score  characteristics. 

Finally, the importance of test length should be noted. The 100-item tests of both
types exhibited reliability of test specifications, content validity and construct validity
very close to that of the Total Comprehensive Job Knowledge Test. Decreases in
overall test quality were seen as test length was reduced. Of course, test length
decisions should be made in the context of the breadth of the content domain being
sampled and intended use of the test scores.

The results of this study point to the need for further research in this area
beyond the scope of this investigation. Alternate sampling plans emphasizing other
relevant features of the content domain should be investigated. It is possible that the
content theory developed for use in this study was not the best one to represent the
content domain. There may exist important features of the content domain that were
not emphasized in the strong test content theory strategy. Also, the overlap in job
knowledges across content areas that was seen in this study suggests that research
into the usefulness of general knowledge testing would be of interest. Perhaps a test
that focuses on the common knowledges would provide information more relevant to
job performance than would a test that samples a broader range of job knowledge.
This, again, would reflect an alternate theory of job knowledge.
Different and more precise approaches to test content specification, such as
facet theory, should be investigated. The best match between test construction
approach and domain characteristics needs to be investigated and established.

Also  beyond  the  scope  of  this  study,  but  worthy  of  investigation,  is  the  issue  of 
the  impact  of  underlying  test  theory  on  mastery  decisions,  such  as  those  required  in 
certification  tests.  The  differences  seen  in  scores  of  novices  and  masters  on  the  STCT 
tests  compared  to  the  WTCT  tests  suggest  that  this  is  an  area  where  test  content  theory 
may  play  a  stronger  role.  Again,  the  intended  use  of  the  test  scores  should  be  a 
determining  factor  in  the  approach  taken  to  the  construction  of  a  test. 

The  iterative  nature  of  test  construction  has  been  emphasized  by  Millman  and 
Greene  (1989).  The  results  of  this  study  should  be  seen  in  the  light  of  this  idea. 
Although  test  items  were  pretested  and  revised,  the  test  specifications  were  not.  The 
iterative  nature  and  steps  for  test  refinement  are  well  documented  for  normative  test 
construction.  However,  it  may  be  that  test  specifications  should  be  subject  to  pretest 
and  refinement  as  well. 

The  aim  of  this  study  was  to  investigate  issues  that  have  practical  applications 
in  test  development.  The  results  of  this  study  suggest  that,  given  a  well  defined 
domain  and  careful  item  development,  differences  in  test  content  theory  such  as  those 
seen  in  this  study  may  not  result  in  test  scores  with  significant  differences  in 
psychometric  properties.  Adequate  test  length  is  required  if  the  measurement 
instrument  is  to  demonstrate  reliability  and  validity.  However,  results  of  this  study 
should  be  interpreted  cautiously  and  should  not  be  generalized  to  domains 
significantly different from the one used in this study. It should be remembered that the
content  domain  used  was  fairly  homogeneous,  was  of  low-to-moderate  difficulty  level, 
and  contained  no  special  requirements,  such  as  safety  certification.  Other  domains 
may  require  testing  more  suited  to  the  characteristics  of  the  specific  domain. 
Additionally,  although  slight  differences  in  test  specifications  may  have  little  impact  on 
the  test  results,  it  is  still  the  responsibility  of  the  test  developer  to  consider  the  issue  of 
test  content  theory  in  test  development  and  to  have  a  defensible  rationale  for  the 
approach  taken. 

Concerns  about  the  quality  of  measurement  instruments  are  not  confined  to 
those  in  the  testing  field.  A  recent  national  news  article  (Leslie  &  Wingert,  1990)  cited 
the question "How do you measure success - against what test?" as the question for
the  1990s  in  education.  Discussing  the  role  of  testing  in  the  American  educational 
system,  the  authors  concluded  that  we  need  new  tests  to  help  us  produce  students 
who  know  how  to  think.  Parents,  politicians,  and  employers  were  cited  as  sources  of 
the  push  for  tests  that  measure  the  right  skills  and  supplement,  rather  than  distort, 
classroom instruction. The trend toward standardized performance-based testing that
includes real-life tasks and makes use of essay questions was cited. These issues
and trends make the issue of test development methods, in general, and test content
theory, specifically, all the more relevant and the need for continued research more
critical.


APPENDIX A

FIRST-TERM AEROSPACE GROUND EQUIPMENT (AGE)
GENERAL MECHANIC TASKS AND ASSOCIATED JOB KNOWLEDGES

SELECTED TASKS WITH ASSOCIATED ITEM NUMBERS AND
CORRESPONDING JOB KNOWLEDGES


Task 154 Perform aircraft support generator visual or service inspections


ITEM NUMBER  KNOWLEDGE MEASURED

62.   Procedure for checking equipment forms.
110.  Method for checking fuel level.
120.  Inspection of the air inlet screen.
136.  Method for checking the emergency shut down lever.
IBS.  Service inspection on an A/M32A-86 generator.
186.  Method for reading the service indicator.
213.  Inspection of cables on a generator set.
245.  Procedure for checking the parking brake.
252.  Interpretation of an oil dip stick reading.
263.  Deflection allowed in drive belts.
302.  Identification of gages needing immediate replacement.
340.  Procedure for checking protective tray lamps.


Task 173 Perform aircraft support generator periodic inspections


ITEM NUMBER  KNOWLEDGE MEASURED

12.   Identify the draw bar or tow bar.
54.   Procedure for packing wheel bearings.
62.   Procedure for checking equipment forms.
109.  Required times for disconnecting battery.
116.  Inspection of a control panel.
120.  Inspection of the air inlet screen.
129.  Periodic inspection on an engine crank case.
145.  Inspection of the governor actuator linkage.
155.  Periodic inspection on fuel lines.
1SB.  Method for straightening bent doors.
177.  Periodic inspection on the external power receptacles
      of an aircraft support generator.
186.  Method for reading the service indicator.
207.  Procedure for removal of a fan belt.
208.  Service inspection of a gas turbine compressor.
213.  Inspection of cables on a generator set.
214.  Identify the AFTO number for the Equipment Status form.
221.  Procedure for cleaning bearings.
244.  Drying and inspection procedures for bearings.
245.  Procedure for checking the parking brake.
278.  Periodic inspection of coolant hoses, lines, and
      fittings.
340.  Procedure for checking protective tray lamps.
354.  Reason for checking the butterfly valve of the
      generator.


Task 444 Build bleed air hoses


ITEM NUMBER  KNOWLEDGE MEASURED

34.   Use for various lockwiring (safety wiring) methods.
109.  Required times for disconnecting battery.
174.  Position of clamps on a bleed air hose.
2B1.  Define scuff cover.
295.  Use of a torque wrench.

Task 247 Adjust magneto or distributor points

ITEM NUMBER  KNOWLEDGE MEASURED

183.  Location of breaker points.
217.  Identify components inside of an ignition breaker
      assembly.

Task 259 Clean magneto or distributor points

ITEM NUMBER  KNOWLEDGE MEASURED

17.   Procedure for cleaning contactors.
93.   Define PSI.
205.  Appropriate use of compressed air.
217.  Identify components inside of an ignition breaker
      assembly.
233.  What PSI symbolizes.
291.  Define what a magneto supplies.
300.  Requirements for wearing protective gear.
3SB.  Procedure for cleaning contactor points.
369.  Procedure for cleaning magneto.


Task 330 Time distributors


ITEM NUMBER  KNOWLEDGE MEASURED

123.  Define firing order.
132.  Methods for checking ignition timing.
141.  Define number 1 cylinder.
161.  Point gap for an NF-2 light cart.
217.  Identify components inside of an ignition breaker
      assembly.
223.  Define gap.
226.  Define top dead center.
283.  Define compression stroke.
305.  Use of a stroboscope.
368.  Meaning of impulse coupling snapping.


Task 200 Clean load contactors


ITEM NUMBER  KNOWLEDGE MEASURED

17.   Procedure for cleaning contactors.
109.  Required times for disconnecting battery.
356.  Procedure for cleaning contactor points.


Task 197 Clean contactor points


ITEM NUMBER  KNOWLEDGE MEASURED

17.   Procedure for cleaning contactors.
109.  Required times for disconnecting battery.
356.  Procedure for cleaning contactor points.


Task 290 Remove or install engine magnetos or distributors


ITEM NUMBER  KNOWLEDGE MEASURED

SB.   Identify the tool used for removal of magneto ignition
      leads.
132.  Methods for checking ignition timing.
216.  When grounding a unit is necessary.
217.  Identify components inside of an ignition breaker
      assembly.
226.  Define top dead center.
291.  Define what a magneto supplies.


Task 331 Time engine magnetos

ITEM NUMBER  KNOWLEDGE MEASURED

84.   Procedure for setting engine timing with the flywheel.
123.  Define firing order.
132.  Methods for checking ignition timing.
141.  Define number 1 cylinder.
144.  Identify meter which measures current flow.
216.  When grounding a unit is necessary.
226.  Define top dead center.
247.  Identify the crankshaft.
283.  Define compression stroke.
286.  Location of the flywheel.
291.  Define what a magneto supplies.
305.  Use of a stroboscope.
368.  Meaning of impulse coupling snapping.


Task 190 Adjust contactor points


ITEM NUMBER  KNOWLEDGE MEASURED

60.   Use of feeler gage.
109.  Required times for disconnecting battery.
226.  Define top dead center.
348.  Procedure for adjusting breaker points.

Task 257 Clean and adjust spark plugs

ITEM NUMBER  KNOWLEDGE MEASURED

60.   Use of feeler gauge.
157.  Procedure for checking spark plug gap on an NF-2 light
      cart.
223.  Define gap.
259.  Inspection of spark plugs for irregularities.
341.  Problems caused by carbon build-up.
362.  Use of a wire brush.

Task 311 Remove or install spark plugs

ITEM NUMBER  KNOWLEDGE MEASURED

60.  Use of feeler gage.

109.  Required times for disconnecting battery.

157.  Procedure for checking spark plug gap on an NF-2 light cart.

216.  When grounding a unit is necessary.

259.  Inspection of spark plugs for irregularities.

295.  Use of a torque wrench.

363.  Use of a wire brush.

Task 303 Remove or install ignition coils

ITEM NUMBER  KNOWLEDGE MEASURED

109.  Required times for disconnecting battery.

272.  Define ignition coil.

Task 489 Remove or install batteries

ITEM NUMBER  KNOWLEDGE MEASURED

30.  Time when wearing a respirator is required.

314.  Procedure for removing the battery.
Task 502 Replace tow bar springs

ITEM NUMBER  KNOWLEDGE MEASURED

12.  Identify  the  draw  bar  or  tow  bar. 


Task 488 Remove or install AGE tire, tube, or wheel assemblies *

ITEM NUMBER  KNOWLEDGE MEASURED

1.  Procedures for preventing corrosion.

11.  Identify the wheel spindle.

18.  Procedure which facilitates handling of the tire when changing the inner tube.

53.  What wheel halves are prepared with before assembly.

64.  Method for preparation of tire before reassembly of a wheel.

147.  Use of a valve core.

148.  Preparation of wheel halves before reassembly.

237.  Location for positioning jack stands.

266.  Method of jacking a unit.

277.  Procedure for full inflation of a tire.

323.  When corrosion prevention methods are used.


Task  477  Pack  wheel  bearings  * 


ITEM  NUMBER  KNOWLEDGE  MEASURED 

7.  Identify  the  grease  cap  (or  hub  cap). 

11.  Identify  the  wheel  spindle. 

13.  Identify  the  hub. 

14.  Identify  the  cotter  pin. 

15.  Identify  the  wheel  retaining  nut. 

37.  Preparation of components for installation of the inner bearing onto the hub.

45.  Procedure for securing the outer bearing of the wheel assembly.

54.  Procedure for packing wheel bearings.

221.  Procedure for cleaning bearings.

234.  Removal of the inner bearing of a split-half rim tire.

237.  Location of positioning jack stands.

244.  Drying and inspection procedures for bearings.

266.  Method of jacking a unit.

300.  Requirements for wearing safety gear.

327.  Location of the grease cap.

338.  Procedure for removal of inner bearing.
Task 473 Adjust brake systems

ITEM NUMBER  KNOWLEDGE MEASURED

6.  Identify the king pin.

19.  Procedure for making adjustments to brake application.

25.  Identify brake-shoe lining.

172.  Purpose of the brake lever knob.

184.  Conclusions drawn when brakes can no longer be adjusted.

237.  Location for positioning jack stands.

266.  Method of jacking a unit.


Task  479  Perform  brake  system  operational  checks 

ITEM NUMBER  KNOWLEDGE MEASURED

245.  Procedure for checking parking brake.

Task 485 Remove or install AGE brake assemblies

ITEM NUMBER  KNOWLEDGE MEASURED

6.  Identify the king pin.

23.  Identify the brake lever.

24.  Identify the dust cover/adjustment cover.

25.  Identify brake-shoe lining.

26.  Identify the backing plate.

121.  Inspection of a brake assembly cam shaft.

138.  Procedure for removal of glazed spots from brake shoe lining.

172.  Purpose of the brake lever knob.

Task 486 Remove or install AGE brake assembly components

ITEM NUMBER  KNOWLEDGE MEASURED

6.  Identify the king pin.

23.  Identify the brake lever.

24.  Identify the dust cover/adjustment cover.

25.  Identify brake-shoe and lining.

26.  Identify the backing plate.

121.  Inspection of a brake assembly cam shaft.

138.  Procedure for removing glazed spots from brake shoe lining.

172.  Purpose of the brake lever knob.
ITEM NUMBER  KNOWLEDGE MEASURED

20.  Identify location of starter.

22.  Periodic inspection procedures for a radiator.

64.  Method for preparation of tire before reassembly of a wheel.

111.  Caution taken with the radiator fan.

136.  Method for checking the emergency shut down lever.

170.  Define solenoid.

197.  Procedure for grounding a unit.

254.  Location of the overspeed governor.

287.  Periodic inspection for an oil filter of a generator.

300.  Requirements for wearing protective gear.

320.  Number of threads that should protrude past a nut.


Task  270  Perform  engine,  motor,  or  generator  operational  checks 


ITEM  NUMBER  KNOWLEDGE  MEASURED 

111.  Caution taken with the radiator fan.

116.  Inspection of a control panel.

225.  Service inspection of the manifold block on a hydraulic test stand.

230.  Operational inspection of lights.

340.  Procedure for checking protective tray lamps.

343.  Inspection of meters and gages.


Task  264  Isolate  engine,  motor,  or  generator  mechanical  malfunctions  * 


ITEM  NUMBER  KNOWLEDGE  MEASURED 

60.  Use of feeler gauge.

132.  Methods for checking ignition timing.

137.  Troubleshooting techniques when engine will not start when cranked.

144.  Identify meter which measures current flow.

157.  Procedure for checking spark plug gap on an NF-2 light cart.

159.  Identify location of the carburetor.

161.  Point gap for an NF-2 light cart.

168.  Condenser should be changed out whenever points are changed.

179.  Causes of a Packette engine backfiring.

252.  Interpretation of an oil dip stick reading.

259.  Inspection of spark plugs for irregularities.

260.  Adjustment of the carburetor fuel-air mixture.

274.  Identify the idle adjustment screw.

275.  Identify the main adjustment screw.

353.  Method for adjustment of the ignition timing.


Task 322 Research TO's for maintenance instructions on engines, motors, or generators *

ITEM NUMBER  KNOWLEDGE MEASURED

38.  Identify information included under a technical order series number.

229.  Information included in technical order sections.

355.  Define illustrated parts breakdown (IPB).


Task 272 Perform TO modifications on engines, motors, or generators

ITEM NUMBER  KNOWLEDGE MEASURED

65.  Define technical order modification.

355.  Define illustrated parts breakdown (IPB).


Task  255  Change  generators  or  alternators  * 


ITEM NUMBER  KNOWLEDGE MEASURED

40.  Identify the end bell.

41.  Identify the end bell band.

42.  Identify the armature.

43.  Identify the stator.

68.  Correct position of brushes when installing on a generator.

77.  Procedure for key after removal from generator.

100.  Method for rotating the armature when cleaning.

109.  Required times for disconnecting battery.

127.  Identify the frame assembly.

153.  Procedure for removal of the armature shaft.

195.  Procedure for reinstallation of the armature.

294.  Precautions taken with brushes when being removed from the alternator.

295.  Use of a torque wrench.

320.  Number of threads that should protrude past a nut.

332.  Location of the control box.

336.  Reattachment of the ground wire during reinstallation of the generator.
Task 299 Remove or install engines, motors, or generators

ITEM NUMBER  KNOWLEDGE MEASURED

96.  Precautions when working on a B-5 maintenance stand.

320.  Number of threads that should protrude past a nut.

333.  Frequency of operational inspection of shop support equipment.


Task  273  Prepare  engines,  motors,  or  generators  for  storage 


ITEM NUMBER  KNOWLEDGE MEASURED

1.  Procedures  for  preventing  corrosion. 

90.  Define  pickling  oil. 

92.  Removal  of  oil,  air  pressure,  and  fuel  when  preparing 
a  unit  for  storage. 

214.  Identify  the  AFTO  form  number  for  the  Equipment  Status 
form. 

261.  Clean  an  atomizer  assembly. 

323.  When  corrosion  prevention  methods  are  used. 


Task 298 Remove or install engine, motor, or generator baffles or shrouds

ITEM NUMBER  KNOWLEDGE MEASURED

113.  Location of fan guards.
Task 269 Perform cylinder compression tests

ITEM NUMBER  KNOWLEDGE MEASURED

85.  Appropriate use for compression tester kit.

93.  Define PSI.

123.  Define firing order.

216.  When grounding a unit is necessary.

231.  Define compression.

233.  What PSI symbolizes.

248.  Define mechanical injector.

259.  Inspection of spark plugs for irregularities.

291.  Define what a magneto supplies.


Task 281 Remove or install engine cylinder head assemblies

ITEM NUMBER  KNOWLEDGE MEASURED

70.  Identify governor types.

94.  Procedure for installation of seals and gaskets on a cylinder block.

104.  Identify the water manifold.

128.  Correct torque sequence when reinstalling a head assembly.

146.  Components removed before removal of the cylinder head assembly.

152.  Location of the cylinder head in relation to other components.

159.  Identify the location of the carburetor.

175.  Identify the push rod.

176.  Identify the rocker arm.

180.  When removal of enclosure assembly is necessary.

201.  Use of new packing.

248.  Define mechanical injector.

295.  Use of a torque wrench.

301.  Inspection of counterbores on cylinders.

332.  Location of the control box.

337.  Safety reasons for disconnecting fuel lines.

342.  Precautions taken when removing the air intake manifold.

372.  Identify the piston crown.
Task 260 Clean motor or generator armatures *

ITEM NUMBER  KNOWLEDGE MEASURED

81.  Procedure for removal of generator cover for cleaning purposes.

100.  Method for rotating the armature when cleaning.

109.  Required times for disconnecting battery.

300.  Requirements for wearing protective gear.

349.  Use of commutator stone.

310.  Necessity of cleaning slip rings after cleaning the commutator.


Task 283 Remove or install engine exhaust manifolds, seals, gaskets, or common hardware

ITEM NUMBER  KNOWLEDGE MEASURED

159.  Identify location of the carburetor.

164.  Number of studs which hold the exhaust manifold in place.

180.  When removal of enclosure assembly is necessary.

292.  Procedure for installation of manifold gaskets and seals.

295.  Use of a torque wrench.

316.  Clamp types.

352.  Procedure for preparing surface of carburetor and manifold for installation of gaskets.

374.  Location of the exhaust manifold.


Task 289 Remove or install engine intake manifolds, seals, gaskets, or common hardware

ITEM NUMBER  KNOWLEDGE MEASURED

109.  Required times for disconnecting battery.

295.  Use of a torque wrench.

320.  Number of threads that should protrude past a nut.

342.  Precautions taken when removing the air intake manifold.

352.  Procedure for preparing surface of carburetor and manifold for installation of gaskets.
Task 314 Remove or install starters

ITEM NUMBER  KNOWLEDGE MEASURED

20.  Identify location of starter.

218.  Isolation of battery when troubleshooting pneumatic systems and starters.

264.  Method for insuring serviceability of the solenoid coil.

295.  Use of a torque wrench.


Task 228 Remove or install gauges

ITEM NUMBER  KNOWLEDGE MEASURED

302.  Identification of gages needing immediate replacement.


Task 199 Clean indicator light receptacles or connectors

ITEM NUMBER  KNOWLEDGE MEASURED

46.  Identify the correct cleaner for indicator light receptacles or connectors.

276.  Clean load contactors.


Task 239 Straighten indicator light receptacles or connectors


ITEM  NUMBER  KNOWLEDGE  MEASURED 

109.  Required  times  for  disconnecting  battery. 

115.  Method  for  straightening  pins  on  a  light  receptacle  or 
contactor. 


Task 284 Remove or install engine fan belts *

ITEM NUMBER  KNOWLEDGE MEASURED

72.  Method for checking belt tension.

109.  Required times for disconnecting battery.

166.  Procedure for taking up slack in the fan belt.

207.  Procedure for removal of a fan belt.

263.  Deflection allowed in drive belts.
Task 285 Remove or install engine flywheels

ITEM NUMBER  KNOWLEDGE MEASURED

130.  Appropriate application of anti-seize compound.

143.  Correct position of capscrews when installing a flywheel.

210.  Use of the flywheel lifting tool.

227.  Relation of flywheel housing to the flywheel assembly.

247.  Identify the crankshaft.

271.  What components must be separated for removal of the flywheel.

286.  Location of the flywheel.

289.  Alignment of flywheel bolt holes and crankshaft bolt holes.

295.  Use of a torque wrench.

333.  Frequency of operational inspection of shop support equipment.


Task 293 Remove or install engine oil pressure-operated switches

ITEM NUMBER  KNOWLEDGE MEASURED

119.  Define  tag  leads. 

211.  Location  of  the  oil  pressure  override  button. 

241.  Define  schematic  diagram. 

313.  Location  of  oil  system  components. 


Task 296 Remove or install engine thermostats

ITEM NUMBER  KNOWLEDGE MEASURED

102.  Identify the thermostat.

103.  Identify the by-pass tube.

104.  Identify the water manifold.

105.  Identify the water outlet elbow.

351.  Procedure  for  draining  the  radiator. 
Task 317 Remove or install turbine engine combustor cans

ITEM NUMBER  KNOWLEDGE MEASURED

34.  Use of various lockwiring (safety wiring) methods.

39.  Inspection procedure for the V-band clamp.

91.  Identify times when wearing gloves is required.

97.  Procedure for capping lines.

209.  Periodic inspection of the combustor cap of the -60 generator.

249.  Identify the flame tube assembly as part of the combustion can.

257.  Inspection of the ignitor plug.

295.  Use of a torque wrench.

303.  Location of atomizer.

350.  Location of the combustor cap and surrounding components.


Task 316 Remove or install turbine engine atomizers

ITEM NUMBER  KNOWLEDGE MEASURED

34.  Use for various lockwiring (safety wiring) methods.

114.  Describe hose assembly at the atomizer.

119.  Define tag leads.

201.  Use of new packing.

209.  Periodic inspection of the combustor cap portion of the -60 generator.

249.  Identify the flame tube assembly as part of the combustion can.

257.  Inspection of the ignitor plug.

295.  Use of a torque wrench.

303.  Location of atomizer.

311.  Unit for which a FOD check is required.


Task  261  Clean  turbine  engine  atomizers 


ITEM NUMBER  KNOWLEDGE MEASURED

249.  Identify the flame tube assembly as part of the combustion can.

261.  Clean an atomizer assembly.

350.  Location  of  the  combustor  cap  and  surrounding 
components. 
Task 315 Remove or install turbine engine atomizer components

ITEM NUMBER  KNOWLEDGE MEASURED

201.  Use of new packing.

217.  Identify components inside of an ignition breaker assembly.

249.  Identify the flame tube assembly as part of the combustion can.

295.  Use of a torque wrench.

303.  Location of atomizer.

350.  Location of the combustor cap and surrounding components.

Task 160 Perform gas turbine compressor visual or service inspections

ITEM NUMBER  KNOWLEDGE MEASURED

62.  Procedure for checking equipment forms.

67.  Inspection of a bleed air hose.

156.  Method for straightening doors.

165.  Safety of operation vehicle inspection.

185.  Service inspection on an A/N32A-86 generator.

192.  Configuration of an ignition switch when performing a voltage check.

208.  Service inspection of a gas turbine compressor.

214.  Identify the AFTO form number for the Equipment Status form.

225.  Service inspection of the manifold block on a hydraulic test stand.

230.  Operational inspection of lights.

245.  Procedure for checking the parking brake.

252.  Interpretation of an oil dip stick reading.

282.  Use of multimeter scale for performing a continuity check.

300.  Requirements for wearing protective gear.

340.  Procedure for checking protective tray lamps.

343.  Inspection of meters and gages.
Task 179 Perform gas turbine compressor periodic inspections *

ITEM NUMBER  KNOWLEDGE MEASURED

34.  Use for various lockwiring (safety wiring) methods.

60.  Use of feeler gauge.

97.  Procedure for capping lines.

114.  Describe hose assembly at the atomizer.

118.  Grounding of the magneto lead to the engine block during a periodic inspection on a Packette engine.

191.  Cleaner for the fuel pump.

197.  Procedure for grounding a unit.

205.  Appropriate use of compressed air.

209.  Periodic inspection of the combustor cap of the -60 generator.

210.  Use of new packing.

215.  Inspection of the "T-bolt" of a V-band clamp.

232.  Periodic inspection of cracks on a flame tube assembly.

249.  Identify the flame tube assembly as part of the combustion can.

257.  Inspection of the ignitor plug.

261.  Clean an atomizer assembly.

300.  Requirements for wearing protective gear.

303.  Location of atomizer.

307.  Procedure for the O-ring seal before replacing the atomizer screen.

310.  Procedure for removal of the atomizer screen.

350.  Location of the combustor cap and surrounding components.

368.  Meaning of impulse coupling snapping.


Task  250  Adjust  turbine  engine  bleed  air  system  components 


ITEM NUMBER  KNOWLEDGE MEASURED

29.  Use for a receiver air gauge.

34.  Use for various lockwiring (safety wiring) methods.

117.  Identify the A.S.S. valve.

300.  Requirements for wearing protective gear.

365.  Precautions taken opening a pneumatic line.


Task  439  Adjust  bleed  air  load  control  valves 


ITEM NUMBER  KNOWLEDGE MEASURED

171.  Method of measuring the rate of opening time for a load control valve on a gas turbine compressor unit.

220.  Define how the plane of rotation is identified.
Task 475 Isolate brake system malfunctions

ITEM NUMBER  KNOWLEDGE MEASURED

13.  Identify the hub.

23.  Identify the brake lever.

24.  Identify the dust cover/adjustment cover.

25.  Identify brake-shoe and lining.

26.  Identify the backing plate.

121.  Inspection of a brake assembly cam shaft.

126.  Identify the brake assembly.

138.  Procedure for removing glazed spots from brake shoe lining.

213.  Inspection of cables on a generator set.

237.  Location for positioning jack stands.

245.  Procedure for checking parking brake.

266.  Method of jacking a unit.


Task 503 Research TO's, charts, or diagrams for AGE enclosures, chassis, or drives *

ITEM NUMBER  KNOWLEDGE MEASURED

38.  Identify information included under a technical order series number.

229.  Information included in technical order sections.

355.  Define illustrated parts breakdown (IPB).


Task  481  Perform  TO  modifications  on  enclosures,  chassis,  or  drives 


ITEM  NUMBER  KNOWLEDGE  MEASURED 

65.  Define  technical  order  modification. 


Task 493 Remove or install enclosure assemblies

ITEM NUMBER  KNOWLEDGE MEASURED

76.  Identify purpose of a baffle.

180.  When removal of enclosure assembly is necessary.

333.  Frequency of operational inspection of shop support equipment.
Task 498 Remove or install steering system components

ITEM NUMBER  KNOWLEDGE MEASURED

124.  Identify the front end assembly.

125.  Identify the U-bolts.

237.  Location for positioning jack stands.

245.  Procedure for checking parking brake.

266.  Method of jacking a unit.

373.  Procedure for installation of wheel half nuts, bolts, and washers.

Task 499 Remove or install steering system component parts

ITEM NUMBER  KNOWLEDGE MEASURED

5.  Identify the tongue assembly.

6.  Identify the king pin.

8.  Identify the axle.

9.  Identify the tie rod.

10.  Identify the ball joint.

11.  Identify the wheel spindle.

12.  Identify the draw bar or tow bar.

13.  Identify the hub.

16.  Identify the appropriate location for a bushing.

Task  238  Splice  electrical  system  wiring  * 


ITEM NUMBER  KNOWLEDGE MEASURED

2.  Procedure for crimping a connector with a crimping tool.

28.  Identify the splicing method which uses a barrel splice.

44.  Procedure for heat shrink insulation before applying solder.

51.  Describe procedure for crimping a connector.

156.  Use of solder gun vs. solder iron.

187.  Procedure for securing heat shrink insulation in place.

242.  Define barrel splice vs. soldering splice.

268.  Method for stripping wire.

318.  Twist and tin lead wires before inserting them into splice.

358.  Procedure for applying flux to a conductor when soldering.
ITEM NUMBER  KNOWLEDGE MEASURED

144.  Identify meter which measures current flow.

194.  Use of multimeter.

241.  Define schematic diagram.


Task 209 Measure resistance of AGE electrical systems other than integrated or solid state circuitry *

ITEM NUMBER  KNOWLEDGE MEASURED

109.  Required times for disconnecting battery.

112.  Conducting continuity checks.

139.  Identify Technical Order for the NC-2A Davey.

144.  Identify meter which measures current flow.

163.  Purpose of performing a continuity check.

192.  Configuration of an ignition switch when performing a voltage check.

194.  Use of multimeter.

241.  Define schematic diagram.

262.  Checking for power at the ignition coil.

272.  Define ignition coil.

317.  Method for continuity checks in an ignition system.

334.  Isolate and perform a continuity check on the ignition system resistor.

366.  Define Ohms.
Task 203 Isolate malfunctions within electrical circuitry other than integrated or solid state circuitry

ITEM NUMBER  KNOWLEDGE MEASURED

78.  Define current.

109.  Required times for disconnecting battery.

144.  Identify meter which measures current flow.

194.  Use of multimeter.

203.  Define amps.

216.  When grounding a unit is necessary.

241.  Define schematic.

282.  Use of multimeter scale for performing a continuity check.

339.  Knowledge required if allowed to work with high voltage.

366.  Define Ohms.


Task 227 Remove or install electrical system components other than integrated or solid state circuitry

ITEM NUMBER  KNOWLEDGE MEASURED

109.  Required times for disconnecting battery.

119.  Define tag leads.

170.  Define solenoid.

207.  Procedure for removal of the fan belt.

213.  Inspection of cables on a generator set.

241.  Define schematic diagram.

314.  Procedure for removing the battery.

321.  Method for insuring serviceability of the solenoid coil.

322.  Define relay.

328.  Interpretation of the wire designation numbers.

340.  Procedure for checking protective tray lamps.


Task 236 Research TO's, charts, or diagrams for electrical maintenance instructions

ITEM NUMBER  KNOWLEDGE MEASURED

38.  Identify information included under a technical order series number.

229.  Information included in technical order sections.

355.  Define illustrated parts breakdown (IPB).
Task 215 Perform AGE electrical system operational checks other than integrated or solid state circuitry *

ITEM NUMBER  KNOWLEDGE MEASURED

31.  Correct position of contactor switch when putting a load on line for an operational inspection.

32.  Precaution that a load bank must be grounded prior to load banking any generator set.

52.  Position of phase selector switch for performance of an electrical system operational check.

133.  Method for opening switches after load banking any generator set.

144.  Identify meter which measures current flow.

151.  Interpretation of the AC contactor indication light.

198.  Rotation of the phase selector knob.

296.  Purpose of monitoring the EGT gage.

30e.  Reason for monitoring unit and load bank gages while load banking a unit.


Task 226 Remove or install cannon plugs

ITEM NUMBER  KNOWLEDGE MEASURED

109.  Required times for disconnecting battery.

156.  Use of solder gun vs. solder iron.

194.  Use of multimeter.

246.  Define cannon plug.

258.  Use of padded channel locks.

318.  Twist and tin lead wires before inserting them into splice.

358.  Procedure for applying flux to a conductor when soldering.


Task  237  Solder  electrical  system  wiring 


ITEM NUMBER  KNOWLEDGE MEASURED

28.  Identify the splicing method which uses a barrel splice.

173.  Application of flux when soldering spliced wires.

242.  Define barrel splice vs. soldering splice.

318.  Twist and tin lead wires before inserting them into splice.

358.  Procedures for applying flux to a connector when soldering.


Task 225 Remove or install cannon plug parts

ITEM NUMBER  KNOWLEDGE MEASURED

109.  Required times for disconnecting battery.

156.  Use of solder gun vs. solder iron.

194.  Use of multimeter.

202.  Identify the grommet.

246.  Define cannon plug.

258.  Use of padded channel locks.

318.  Twist and tin lead wires before inserting them into a splice.

358.  Procedure for applying flux to connector when soldering.

Task 235 Remove or install voltage regulators

ITEM NUMBER  KNOWLEDGE MEASURED

109.  Required times for disconnecting battery.

241.  Define schematic diagram.

375.  Procedure for removal of the voltage regulator from the -60 generator set.

Task 494 Remove or install hinges, stays, or fasteners

ITEM NUMBER  KNOWLEDGE MEASURED

79.  Use of drill.

108.  Method for removal of rivets.

150.  Method for attaching fasteners.

300.  Requirements for wearing protective gear.

311.  Unit for which a FOD check is required.

Task  504  Straighten  panels,  doors,  or  covers 

ITEM NUMBER  KNOWLEDGE MEASURED

158.  Method for straightening bent doors.

Task 478 Paint, stencil, or mark AGE

ITEM NUMBER  KNOWLEDGE MEASURED

30.  Time when wearing a respirator is required.

187.  Define field number.

258.  Procedure for application of field numbers to units.

283.  Paint colors used for different categories of information.
Task 464 Reflectorize AGE


ITEM  NUMBER  KNOWLEDGE  MEASURED 

89.  Appropriate  application  of  reflective  tape  to  AGE 
vehicles. 


Task  482  Prepare  AGE  for  painting  except  magnesium  housings 


ITEM NUMBER  KNOWLEDGE MEASURED

251.  Procedure for sanding a unit in preparation for painting.

300.  Requirements for wearing protective gear.


Task  555  Prepare  AGE  for  mobility  or  training  exercises  * 


ITEM NUMBER  KNOWLEDGE MEASURED

21.  Requirements for preparing a unit for shipping.

33.  Procedure for preparing lights on a light cart for shipment.

81.  Appropriate fuel level of a unit when air shipping.

109.  Required times for disconnecting battery.

176.  Procedure for preparation of tires for air shipment.

188.  Procedure for storing light cables of an NF-2 when preparing for mobility and training.

235.  Documentation shipped with AGE units for mobility.


Task 275 Remove or install carburetors *

ITEM NUMBER  KNOWLEDGE MEASURED

109.  Required times for disconnecting battery.

159.  Identify the location of the carburetor.

239.  Connection of the governor linkage after installation of a carburetor.

270.  Periodic inspection of the oil bath air cleaner.

315.  Procedure for the choke cable when removing a carburetor.

337.  Safety reasons for disconnecting fuel lines.

347.  Location of the air cleaner.

352.  Procedure for preparing surface of carburetor and manifold for installation of gaskets.
Task 286 Remove or install engine fuel pumps *


ITEM NUMBER
57.

66.

109.

131.


KNOWLEDGE MEASURED
Identify location of the fuel pump.

Procedure for removal of fuel pump.

Required times for disconnecting battery.

Insure proper alignment of cam arm and pump before installation of a fuel pump.

Cleaner for the fuel pump.

Location of components around the fuel pump.

Close fuel shut-off valve before removing fuel pump.

Procedure for installation of a new fuel pump.

Procedure for preparing surface of carburetor and manifold for installation of gaskets.

Procedure for disconnecting fuel lines when removing the fuel pump.


Remove or install fuel lines or fittings other than diesel *

ITEM NUMBER  KNOWLEDGE MEASURED

73.  Method for placing a tube in the holding bar of a flaring tool.

74.  Use of flaring tool to make flare on tube.

299.  Use of a deburring tool.

346.  Procedure for installation of a new fuel pump.


Task  263  Fabricate  engine  fuel  lines 


ITEM NUMBER  KNOWLEDGE MEASURED

36.  Identify correct angle used when cutting tubing.

73.  Method for placing a tube in the holding bar of a
flaring tool.

74.  Use of a flaring tool to make flare on tube.

93.  Define PSI.

134.  Use of tubing cutters.

155.  Periodic inspection on fuel lines.

193.  Use of tubing benders.

233.  What PSI symbolizes.




Task 248 Adjust reciprocating engine fuel system components

ITEM NUMBER  KNOWLEDGE MEASURED

159.  Identify the location of the carburetor.

297.  Procedure for setting the idle mixture adjustment.

Task 251 Adjust turbine engine fuel system components *

ITEM NUMBER  KNOWLEDGE MEASURED

34.  Use for various lockwiring (safety wiring) methods.

109.  Required times for disconnecting battery.

199.  Procedure for adjusting cracking pressure.

255.  Procedure taken before adjusting cracking pressure.

303.  Location of atomizer.

304.  Procedure for bleeding air from a hydraulic line.

337.  Safety reasons for disconnecting fuel lines.

Task 487 Remove or install AGE fuel tanks or components

ITEM NUMBER  KNOWLEDGE MEASURED

264.  Define fog a fuel tank.

Task  483  Purge  fuel  tanks 

ITEM NUMBER  KNOWLEDGE MEASURED

264.  Define fog a fuel tank.


Task 142 Make entries on AFTO form 349 (maintenance data collection
record)


ITEM NUMBER  KNOWLEDGE MEASURED

27.  Identify the "How MAL Code."

47.  Describe the information which goes in the "Quantity"
block on AFTO Form 350.

75.  Identify a work center code.

80.  Define ID number.

82.  Identify a type maintenance code.

181.  Identify the action taken code.

306.  Define the job control number.

319.  Identify the work unit code.

344.  Describe some discrepancies for AFTO Form 244.

363.  Procedure for writing time and date on AFTO Form 349.


Task 143 Make entries on AFTO form 350 (reparable item processing tag)


ITEM NUMBER  KNOWLEDGE MEASURED

27.  Identify the "How MAL Code."

47.  Describe the information which goes in the "Quantity"
block on AFTO Form 350.

80.  Define ID number.

82.  Identify a type maintenance code.

96.  Identify location of the Federal Stock Class number on
AFTO Form 350.

160.  Identify the Standard Reporting Description (SRD).

238.  Define nomenclature.

250.  Identify the national stock number (NSN).

253.  Identify the when discovered code.

290.  Identify an organization code.

306.  Define the job control number.

319.  Identify a work unit code.

344.  Describe some discrepancies for AFTO Form 244.


Task 108 Maintain AFTO form 244 and AFTO form 245 (system/equipment
status record and continuation sheet)


ITEM NUMBER  KNOWLEDGE MEASURED

75.  Identify a work center code.

80.  Define ID number.

154.  Procedure for noting the carry forward discrepancy on
AFTO Form 244.

167.  Define field number.

222.  Completion of the non-scheduled inspection section of
AFTO Form 244.

250.  Identify the national stock number (NSN).

306.  Define the job control number.

324.  Identify the registration number.

342.  Identify the work unit code.

357.  Identify the condition code.


Task 120 Make entries on AF Form 2005 (issue/turn-in request) *


ITEM NUMBER  KNOWLEDGE MEASURED

4.  Identify colors of tags used for NRTS items.

49.  Identify the activity code.

80.  Define ID number.

101.  Define serviceability.

149.  Define condemned.

290.  Identify an organization code.

335.  Identify the shop code.

357.  Identify the condition code.

363.  Procedure for writing time and date on AFTO Form 349.

367.  Identify the sequence code.


Task 162 Perform hydraulic test stand visual or service inspections *


ITEM NUMBER  KNOWLEDGE MEASURED

55.  Identify location of a hydraulic test stand.

59.  Define high pressure relief valve.

62.  Procedure for checking equipment forms.

87.  Procedure for performing an operational check on the
flow control valve.

108.  Correct hydraulic reservoir fluid level on a hydraulic
test stand.

110.  Method for checking fuel level.

206.  Service inspection of external hoses on a hydraulic
test stand.

212.  Frequency of an operational inspection of a hydraulic
test stand.

214.  Identify the AFTO number on the Equipment Status form.

225.  Service inspection of the manifold block on a
hydraulic test stand.

245.  Procedure for checking the parking brake.


Task 181 Perform hydraulic test stand periodic inspections *


ITEM NUMBER  KNOWLEDGE MEASURED

34.  Use for various lockwiring (safety wiring) methods.

109.  Required times for disconnecting battery.

201.  Use of new packing.

228.  Procedure for relieving system pressure on a hydraulic
test stand.

280.  Order of parts in the filter assembly of a hydraulic
test stand.

285.  Periodic inspection on a hydraulic test stand high
pressure filter assembly.

360.  Procedure for low pressure filter assembly of a
hydraulic test stand during a periodic inspection.

Task 407 Perform AGE hydraulic system operational checks

ITEM NUMBER  KNOWLEDGE MEASURED

30.  Time when wearing a respirator is required.

48.  Interpretation of the warning horn on a hydraulic test
stand.

136.  Method for checking the emergency shut down lever.

312.  Identify the pressure compensator valve.

361.  Position of the reservoir selector valve during a fill
and bleed operation.

371.  Define the volume control valve.

Task 421 Remove or install hydraulic lines or fittings *

ITEM NUMBER  KNOWLEDGE MEASURED

3.  Procedure for removing a hose from a hydraulic pump
and ram.

71.  Method for servicing a reservoir in a hydraulic system.

189.  Position of drip pan when in use.

304.  Procedure for bleeding air from a hydraulic line.

330.  Hydraulic fluid type for a unit.


Task 405 Drain, flush, and refill AGE hydraulic reservoirs


ITEM NUMBER  KNOWLEDGE MEASURED

30.  Time when wearing a respirator is required.

55.  Identify location of a hydraulic test stand.

108.  Correct hydraulic reservoir fluid level on a hydraulic
test stand.

109.  Required times for disconnecting battery.

135.  Identify the cleaner for outside of the hydraulic
reservoir.

240.  What is used to flush a contaminated hydraulic
reservoir.

312.  Identify the pressure compensator valve.

330.  Hydraulic fluid type for a unit.

361.  Position of the reservoir selector valve when
performing a fill and bleed operation.

371.  Define the volume control valve.


Task 406 Isolate hydraulic system malfunctions


ITEM NUMBER  KNOWLEDGE MEASURED

50.  Identify gages found on a hydraulic test stand control
panel.

241.  Define schematic diagram.

312.  Identify the pressure compensator valve.

330.  Hydraulic fluid type for a unit.

361.  Position of the reservoir selector valve when
performing a fill and bleed operation.

371.  Define the volume control valve.


Task 437 Research TO's, charts, or diagrams for AGE hydraulic systems
maintenance instructions


ITEM NUMBER  KNOWLEDGE MEASURED

38.  Identify information included under a technical order
series number.

229.  Information included in technical order sections.

355.  Define illustrated parts breakdown (IPB).


Task 436 Replace seals or "O" rings in hydraulic system components

ITEM NUMBER  KNOWLEDGE MEASURED

201.  Use of new packing.

298.  Coat hydraulic system O-rings with hydraulic fluid.

307.  Procedure for the O-ring seal before replacing the
atomizer screen.

Task 395 Adjust hydraulic fill and bleed systems

ITEM NUMBER  KNOWLEDGE MEASURED

59.  Define high pressure relief valve.

361.  Position of the reservoir selector valve when
performing a fill and bleed operation.

376.  Define the fill system relief valve.

Task 568 Perform general shop housekeeping, such as cleaning drip pans
and sweeping floors

ITEM NUMBER  KNOWLEDGE MEASURED

142.  Procedure for cleaning the drip pan.

311.  Unit for which a FOD check is required.

Task 567 Paint shop facilities, such as desks and walls

ITEM NUMBER  KNOWLEDGE MEASURED

288.  Use of a drop cloth.

Task 544 Clean vehicles

ITEM NUMBER  KNOWLEDGE MEASURED

1.  Procedures for preventing corrosion.

30.  Time when wearing a respirator is required.

SO.  Procedure for cleaning AGE vehicles.

311.  Unit for which a FOD check is required.

323.  When corrosion prevention methods are used.


ITEM NUMBER  KNOWLEDGE MEASURED

31.  Correct position of contactor switch when putting a
load on line for an operational inspection.

69.  Define blowdown.

218.  Isolation of battery when troubleshooting pneumatic
systems.

228.  Procedure for relieving system pressure on a hydraulic
test stand.

279.  Configuration of switches when connecting the power
cable for load testing a generator set.

282.  Use of multimeter scale for performing a continuity
check.

321.  Method for insuring serviceability of the solenoid
coil.

366.  Define Ohms.


Task  467  Remove  or  Install  pneumatic  system  lines  or  fittings 


ITEM NUMBER  KNOWLEDGE MEASURED

97.  Procedure for capping lines.

109.  Required  times  for  disconnecting  battery. 

365.  Precautions  when  opening  a  pneumatic  line. 


Task  468  Remove  or  Install  pneumatic  system  pressure  gauges 


ITEM  NUMBER  KNOWLEDGE  MEASURED 

97.  Procedure  for  capping  lines. 

109.  Required  times  for  disconnecting  battery. 

302.  Identification of gages needing immediate replacement.

365.  Precautions when opening a pneumatic line.


Task 457 Remove or install pneumatic filtering system components


ITEM NUMBER  KNOWLEDGE MEASURED

109.  Required times for disconnecting battery.

236.  Use of filter wrench.

265.  Frequency of draining the moisture separator.

331.  Frequency of changing dehydrators.

365.  Precautions when opening a pneumatic line.

Task 152 Perform aircraft support air compressor visual or service
inspections

ITEM NUMBER  KNOWLEDGE MEASURED

62.  Procedure for checking equipment forms.

91.  Identify times when wearing gloves is required.

162.  Precautions taken when fueling AGE units.

200.  Precaution taken when removing the radiator cap.

213.  Inspection of cables on a generator set.

214.  Identify the AFTO form number on the Equipment Status
form.

225.  Service inspection of the manifold block on a
hydraulic test stand.

231.  Define compression.

245.  Procedure for checking parking brake.

265.  Frequency of draining the moisture separator.

331.  Frequency of changing dehydrators.


Task 171 Perform aircraft support air compressor periodic inspections


ITEM NUMBER  KNOWLEDGE MEASURED

20.  Identify location of starter.

31.  Correct position of contactor switch when putting a
load on line for an operational inspection.

35.  Identify valves which must be closed to build up
pressure on an MC-1A air compressor.

54.  Procedure for packing wheel bearings.

57.  Identify location of the fuel pump.

64.  Method for preparation of tire before reassembly of a
wheel.

69.  Define blowdown.

72.  Method for checking belt tension.

99.  Inspect a battery.

155.  Periodic inspection on fuel lines.

180.  When removal of enclosure assembly is necessary.

182.  Frequency of the 10-micron air filter element
inspection.

196.  Location of the sediment bowl.

219.  Periodic inspection on a frame assembly.

221.  Procedure for cleaning bearings.

231.  Define compression.

245.  Procedure for checking parking brake.

263.  Deflection allowed in drive belts.

264.  Define fog a fuel tank.

270.  Periodic inspection of the oil bath air cleaner.

273.  Adjustment of the float and needle assembly.

279.  Configuration of switches when connecting the power
cable for load testing a generator set.

291.  Define what a magneto supplies.

309.  Inspection of pintle hooks.

328.  Interpretation of wire designation numbers.

331.  Frequency of changing dehydrators.


Task 155 Perform aircraft support load bank visual or service
inspections *


ITEM NUMBER  KNOWLEDGE MEASURED

32.  Precaution that a load bank must be grounded prior to
load banking any generator set.

269.  Location of fuse values needed for a unit.

343.  Inspection of meters and gages.




Task  268  Load  test  generator  sets 


ITEM  NUMBER  KNOWLEDGE  MEASURED 

32.  Precaution  that  a  load  bank  must  be  grounded  prior  to 
load  banking  any  generator  set. 

95.  Bank  used  for  a  resistive  load. 

122.  Identify  the  correct  phase  sequence  when  loading  a 
unit. 

140.  Define  Hz. 

279.  Configuration of switches when connecting the power
cable for load testing a generator set.

284.  Purpose  of  the  PF  meter. 


Task  548  Fuel  AGE 


ITEM  NUMBER  KNOWLEDGE  MEASURED 

107.  Identification  of  correct  fuel  for  each  unit. 

162.  Precautions  taken  when  fueling  AGE  units. 

243.  Identify  fuel  types. 


Task 549 Inspect vehicles for safety of operation *


ITEM NUMBER  KNOWLEDGE MEASURED

56.  Procedure for checking the coolant level in a sealed
cooling system.

62.  Procedure for checking equipment forms.

99.  Inspect a battery.

165.  Safety of operation vehicle inspection.

167.  Define field number.

214.  Identify the AFTO number on the Equipment Status form.

230.  Operational inspection of lights.

245.  Procedure for checking parking brake.

252.  Interpretation of an oil dip stick reading.

309.  Inspection of pintle hooks.

326.  Location of the exhaust system/spark arrestor.

329.  Components checked for leaks during a safety of
operation inspection.


Task  170  Perform  shop  support  equipment  visual  or  service  Inspections 


ITEM  NUMBER  KNOWLEDGE  MEASURED 

333.  Frequency  of  operational  Inspection  of  shop  support 
equipment. 

359.  Frequency of an operational inspection of eye washers.


Task 157 Perform bomb lift visual or service inspections

190.  Service inspection on a bomb lift.

Task ?46 Adjust gas turbine engine governors

ITEM NUMBER  KNOWLEDGE MEASURED

86.  Identify location of the fuel control cluster.

204.  When observation of RPMs is important.

Task 245 Adjust gas reciprocating engine governors

ITEM NUMBER  KNOWLEDGE MEASURED

83.  Procedure for adjusting spring tension on a gas
reciprocating engine governor.

171.  Method of measuring rate of opening time for a load
control valve on a gas turbine compressor unit.

204.  When observation of RPMs is important.

345.  Location of the turbine engine governor.

Task 291 Remove or install engine mechanical governors

ITEM NUMBER  KNOWLEDGE MEASURED

345.  Location of the turbine engine governor.

Task 552 Operate two-way vehicle radios

ITEM NUMBER  KNOWLEDGE MEASURED

224.  Proper radio response.


Task 554 Pick up or deliver AGE or AGE parts


ITEM NUMBER  KNOWLEDGE MEASURED

B3.  Define pick-up delivery area.

91.  Identify times when wearing gloves is required.

224.  Proper radio response.

* indicates tasks which were tested in the Walk-Through Performance
Test


APPENDIX B

EXAMPLES OF COMPREHENSIVE JOB KNOWLEDGE TEST ITEMS


1 - A / B


AEROSPACE GROUND EQUIPMENT
GENERAL MECHANIC
AFSC 454X1
JOB KNOWLEDGE TEST


AFPT 80-423-205


AEROSPACE GROUND EQUIPMENT SPECIALTY (AFS 454X1)
JOB KNOWLEDGE TEST


Directions:


Turn your answer sheet and print your name and the date in the blocks
provided. Fill in the corresponding ovals. In the "Numeric Grid," enter
your SSAN in positions 1 through 9. In the block marked "Sex" blacken
the appropriate oval. In the block marked "Code" fill in the oval
designated by the test administrator.

Each item in this booklet consists of a question or statement
followed by four choices. There is only one choice that answers the
question or completes the statement correctly. Be sure to read each
question and all of the choices before answering. Decide which choice is
correct and blacken the letter on your answer sheet that matches the
letter and item number. Here is an example:

112  What color is the sky?

A.  Red

B.  Yellow

C.  Blue

D.  Green

Since the sky is blue, the answer is C. On the sample answer sheet,
the oval containing the C has been blackened.

Be sure to use a number 2 pencil and blacken only one oval for each
item. Note that the answer sheet has an "E" response whereas the test
has no "E" options. Please be careful not to fill in the letter "E"
response at any time. If you have to change an answer, erase your first
mark completely, and then mark your new choice. Erase any stray marks,
being careful not to tear the answer sheet.

The questions in this booklet are to be answered on spaces 1 - 376
on the answer sheet you have been given.

Do not spend too much time on any one item. If you have trouble
with an item, skip it, and come back to it after you finish the other
items. Although you may be unfamiliar with a task, make the best choice
you can for each item. Try to answer every item.


SAMPLE  ANSWER  SHEET 




This test has been designed to determine the amount of
knowledge you have of the Aerospace Ground Equipment Specialty
(AFSC 454X1). The information collected will be used for research
purposes only and will have no effect on your career. Test
results will be available for your review after all tests have
been administered. If you would like to see your test results at
this later date, please indicate so by signing the TEST RETURN
SIGN-UP SHEET.


PRIVACY ACT STATEMENT

AUTHORITY: 44USC3103, 10USC133, 10USC3012, EO9397


The information collected by the answer sheet will be used solely
for research and development purposes. Use of the social security
account number is necessary to make positive identification of the
individual and records.

Information provided by respondents will be treated as confidential
and will be used for official research purposes only. Individual
identity will not be revealed. The research information obtained
will be used only to improve the utilization of personnel resources
within the Armed Forces.

Cooperation and disclosure of this information is voluntary.

Failure to provide information would hinder the ability of the
Armed Forces to best utilize its personnel resources. Your
cooperation in this effort is appreciated.


1.  Which of the following is NOT a corrosion preventive
maintenance method?

A.  Inspection.

B.  Cleaning.

C.  Painting.

D.  Replacement.

2.  What is the proper crimping method for a connector when
joining two wires with a solderless connector?

A.  Crimp wire at any spot on either side of the connector.

B.  Crimp wire half way between center and end on both sides
of the connector.

C.  Crimp wire only once at center of connector.

D.  Crimp wire at the outer edge of both sides of the
connector.

3.  What is the correct procedure for removing a hydraulic hose
from the ram and pump?

A.  Apply one open-end wrench to the hose nipple and a
second to the hose fitting.

B.  Use an open-end wrench to remove hose and nipple
assembly as a unit.

C.  Use vise grips to remove the hose and nipple assembly as
a unit.

D.  Use two adjustable wrenches to remove the hose nipple
and the hose fitting.

4.  What are the colors of the two tags used for NRTS items?

A.  Green and red.

B.  Green and yellow.

C.  Red and yellow.

D.  Red and white.


5.  Identify the TONGUE ASSEMBLY.

A.  1

B.  20

C.  24

D.  25

6.  Identify the KING PIN.

A.  8

B.  15

C.  16

D.  22

7.  Identify the GREASE CAP (or HUB CAP).

A.  5

B.  7

C.  10

D.  12

8.  Identify the AXLE.

A.  1

B.  14

C.  20

D.  24

9.  Identify the TIE ROD.

A.  10

B.  12

C.  14

D.  17


10.  Identify a BALL JOINT.

A.  10

B.  11

C.  12

D.  15

11.  Identify the WHEEL SPINDLE.

A.  7

B.  17

C.  24

D.  25

12.  Identify the DRAW BAR (or TOW BAR).

A.  1

B.  14

C.  20

D.  24

13.  Identify the HUB.

A.  5

B.  7

C.  13

D.  It

14.  Identify a COTTER PIN.

A.  3

B.  8

C.  16

D.  18


15.  Identify the WHEEL RETAINING NUT (or CASTELLATED NUT).


A.  4

B.  6

C.  n

D.  23


16.  Identify the location in which a BUSHING would be installed.

A.  12

B.  17

C.  19

D.  25

17.  What should be used to clean contactors?

A.  A coarse file.

B.  A burnishing tool and electrical contact cleaner.

C.  A shop rag and electrical contact cleaner.

D.  A shop rag and PD-680 type II solvent.

18.  What should be done to facilitate handling of the tire when
changing the inner-tube?

A.  Install the deflated inner-tube into tire assembly.

B.  Slightly inflate inner-tube inside tire to prevent
pinching.

C.  Inflate inner-tube outside tire to 10-15 psi.

D.  Use silicone grease on tube and inside tire surface to
ease assembly.


19.  If the handbrake lever knob reaches the limit of its
adjustment, what is the most common method of making further
adjustments?

A.  Shorten the linkage.

B.  Lengthen the linkage.

C.  Replace the linkage.

D.  Replace the adjustment knob mechanism.

20.  To what component is the starter secured?

A.  Crankcase.

B.  Engine block.

C.  Gear box.

D.  Torque convertor.

21.  Which of the following does NOT have to be stenciled on a
unit before the unit is shipped?

A.  Weight of unit.

B.  Center of balance.

C.  Date and time unit is prepared for shipment.

D.  Height, length, and width of unit.

22.  Which of the following is NOT done to the radiator during a
periodic inspection?

A.  Pressure test.

B.  Check for obstructions.

C.  Clean outer core with solvent (PD-680 type II).

D.  Check for proper coolant level.


23.  Identify the BRAKE LEVER.

A.  2

B.  3

C.  5

D.  6

24.  Identify the DUST COVER/ADJUSTMENT COVER.

A.  2

B.  4

C.  5

D.  6

25.  Identify the BRAKE SHOE and LINING.

A.  1

B.  4

C.  5

D.  7

26.  Identify the BACKING PLATE.

A.  2

B.  4

C.  6

D.  7

27.  Which of the following is a "How MAL Code"?

A.  Q

B.  20

C.  020

D.  BAD


28.  What splicing method requires the use of a barrel splice?


A.  Crimp.

B.  Soldered heat shrink.

C.  Twist and solder.

D.  Wicking.

29.  What does a receiver air gage measure?

A.  Compression.

B.  Air Flow.

C.  Air Quality.

D.  Air Quantity.

30.  When are you required to wear a respirator?

A.  When being exposed to flammable liquids.

B.  When using pressurized water.

C.  When transporting liquid acid containers.

D.  When being exposed to paint particles.

31.  When performing an operational check on AGE electrical
systems, where do you position the contactor switch to put
the load on line?

A.  Open position.

B.  Closed position.

C.  Reset position.

D.  Neutral position.


32.  Prior to load banking any generator set, what safety
precaution must be taken?

A.  Turn "on" the generator's AC contactor switch and
connect to load bank.

B.  Assure load bank is properly grounded.

C.  Be sure all shock load switches are in the "on" position.

D.  Close the cable doors.

33.  How do you prepare the lights on a light cart for shipment?

A.  Remove and box.

B.  Secure in position.

C.  Stow in the internal brackets.

D.  Disconnect from sockets and tape.

34.  Which of the following statements is true concerning safety
wiring methods?

A.  Always use the double-twist method.

B.  The double-twist method is recommended for use on screws
in a closely spaced pattern.

C.  The single-twist method is used in places that are
difficult to reach.

D.  The single-twist method is the most commonly used one.

35.  What valves must be closed to build up pressure on an MC-1A
air compressor?

A.  Dehydrator bleed valve and receiver drain valve.

B.  Dehydrator bleed valve and regulator isolation valve.

C.  Regulator isolation valve and air service valve.

D.  Priority valve and air service valve.


36.  How would you describe the correct angle at which copper
tubing should be cut?

A.  At a slight angle.

B.  At a 45 degree angle.

C.  At any angle.

D.  Square.

37.  What must you do to the bearing cone and rollers, the
spindle, and races before installing the inner bearing onto
the hub?

A.  Clean thoroughly.

B.  Clean, inspect, and sufficiently grease.

C.  Always replace them with new parts and then grease.

D.  Nothing required, just reinstall.

38.  What type of equipment is covered in the 35C2 technical order
series?

A.  Air compressors.

B.  Generators.

C.  Heaters.

D.  Test stands.

39.  Which of the following would NOT be a concern when inspecting
the V-band clamp on a bleed air hose?

A.  Tool marks and cracks.

B.  Spreading at the open ends.

C.  Radial distortion.

D.  Discoloration.


40.  Identify the END BELL.

A.  1

B.  2

C.  4

D.  8

41.  Identify the END BELL BAND.

A.  1

B.  2

C.  4

D.  8

42.  Identify the ARMATURE.

A.  3

B.  5

C.  6

D.  7

43.  Identify the STATOR.

A.  5

B.  6

C.  7

D.  8


44.  What is the first thing you do with the heat shrink
insulation when soldering two wires together?


A.  Slide it over one end of the exposed conductor and apply
heat to one side to tack the insulation in place before
beginning.

B.  Slide it over one end of the exposed conductor and slide
it up the lead and out of the way.

C.  The heat shrink insulation is not needed until the
splice is complete; therefore, it should be set out of
the way.

D.  Split evenly down one side to allow for proper
installation and proper shrinkage.


45.  Which of the following methods should be used to secure the
outer bearing after replacing the wheel assembly?

A.  Install nut and torque to 25 foot pounds.

B.  While rotating wheel, tighten the nut until noticeable
resistance is felt; back nut off one full turn.

C.  Tighten nut until heavy drag is felt; rotate wheel;
tighten until next castellation.

D.  While rotating the wheel, tighten the nut until heavy
drag is felt; back off to first castellation.

46.  What is the correct type of cleaner for use on indicator
light receptacles or connectors?

A.  Contact cleaner.

B.  Acid cleaner.

C.  Solvent.

D.  Emery cloth.

47.  What sort of information goes into the "QTY" block on AFTO
Form 350 if transporting an NF-2 with a cracked door hinge?

A.  Crew size.

B.  Number of units.

C.  Number of parts.

D.  Amount of time required for repairs.


48.  What does the warning horn on a hydraulic test stand indicate?

A.  Low fuel.

B.  Low reservoir level.

C.  Low boost pressure.

D.  High fluid temperature.


49.  Which of the following is an activity code that could be
found on an AF Form 2005 (Supply Issue and Turn In Form)?

A.  X

B.  R5

C.  622

D.  2124

50.  Which of the following would NOT be found on a MJ-2A
hydraulic test stand control panel?

A.  Air pressure gage.

B.  Supply inlet gage.

C.  Exhaust temperature gage.

D.  Stand reservoir pressure gage.


51.  What is meant by crimping a connector?

A.  To bend at either side in order to prevent connector
slippage.

B.  To press together on either side to form a solid
connection.

C.  To lengthen a short wire, without replacing the entire
wire assembly.

D.  To shorten a wire, without replacing the entire wire
assembly.


APPENDIX C
RATING FORMS


SUPERVISORY RATING FORM


Supervisor: _ 

Please  use  the  scale  below  to  rate  ______________________ 

on their KNOWLEDGE in eight general areas of the AGE career field. The
scale is repeated at the top of each page. The eight areas are listed
with definitions of each. Write your rating of the individual's
knowledge in the space to the right of each area.


KNOWLEDGE  RATING  SCALE 

5 Able to recognize and identify components in complex
and common systems. Knows all procedures and system
relationships. Knows many troubleshooting methods.
Aware of all safety precautions.

4 Able to recognize and identify components of some
complex systems and most common systems. Knows most
procedures and system relationships. Knows some
troubleshooting methods. Aware of most safety
precautions.

3 Able to recognize and identify components of most
common systems. Knows many procedures and system
relationships. Knows how to find the troubleshooting
charts. Aware of many safety precautions.

2 Able to recognize and identify some components of
common systems. Knows some procedures and system
relationships. Knows how to find troubleshooting
charts with some difficulty. Aware of basic safety
precautions.

1 Able to recognize or identify a few components of
common systems. Knows very few procedures or system
relationships. Able to find troubleshooting charts
only with great difficulty. Aware of a few basic
safety precautions.


I.  GENERAL  AGE  MAINTENANCE  RATING  -  _ 

Knowledge of common hand tools, special tools, test equipment,
and shop support equipment for the use of isolating and correcting
malfunctions by removing, repairing, and replacing components. This
includes knowledge concerning tasks such as lockwire installation,
corrosion treatment, and minor structural repair.

II.  AGE  ADMINISTRATIVE  FUNCTIONS  RATING  -  _ 

Knowledge of technical orders systems for the purpose of
locating maintenance information and completing required entries in
maintenance forms. Example: knows how to research and identify
parts using IPBs and then make proper entries in AFTO Forms 244, 350,
or AF Form 2005.

KNOWLEDGE RATING SCALE


5 Able to recognize and identify components in complex
and common systems. Knows all procedures and system
relationships. Knows many troubleshooting methods.
Aware of all safety precautions.

4 Able to recognize and identify components of some
complex systems and most common systems. Knows most
procedures and system relationships. Knows some
troubleshooting methods. Aware of most safety
precautions.

3 Able to recognize and identify components of most
common systems. Knows many procedures and system
relationships. Knows how to find the troubleshooting
charts. Aware of many safety precautions.

2 Able to recognize and identify some components of
common systems. Knows some procedures and system
relationships. Knows how to find troubleshooting
charts with some difficulty. Aware of basic safety
precautions.

1 Able to recognize or identify a few components of
common systems. Knows very few procedures or system
relationships. Able to find troubleshooting charts
only with great difficulty. Aware of a few basic
safety precautions.


III.  AGE  GAS  TURBINE  MAINTENANCE  RATING  -  _ 

Knowledge required for isolating and correcting malfunctions
within the electrical, pneumatic, fuel and lubrication systems of gas
turbine compressors. This includes knowledge of procedures required
for removing, replacing, cleaning and adjusting.


IV.  AGE  PERIODIC  INSPECTIONS  RATING  -  _ 

Knowledge of scheduled preventative maintenance actions as
outlined in the appropriate technical data. This includes knowledge
of the system on which the periodic inspection is performed.


V. AGE PNEUDRAULIC SYSTEM MAINTENANCE RATING - _

Knowledge required to isolate and correct malfunctions in AGE
pneumatic and hydraulic systems. This includes knowledge of
procedures required for removing, replacing, adjusting, and
performing operational checks.

KNOWLEDGE RATING SCALE

5 Able to recognize and identify components in complex
and common systems. Knows all procedures and system
relationships. Knows many troubleshooting methods.
Aware of all safety precautions.

4 Able to recognize and identify components of some
complex systems and most common systems. Knows most
procedures and system relationships. Knows some
troubleshooting methods. Aware of most safety
precautions.

3 Able to recognize and identify components of most
common systems. Knows many procedures and system
relationships. Knows how to find the troubleshooting
charts. Aware of many safety precautions.

2 Able to recognize and identify some components of
common systems. Knows some procedures and system
relationships. Knows how to find troubleshooting
charts with some difficulty. Aware of basic safety
precautions.

1 Able to recognize or identify a few components of
common systems. Knows very few procedures or system
relationships. Able to find troubleshooting charts
only with great difficulty. Aware of a few basic
safety precautions.


VI.  AGE  RECIPROCATING  ENGINE  MAINTENANCE  RATING  -  _ 

Knowledge required to isolate and correct malfunctions in AGE
gasoline and diesel engines. Examples: knowledge of complex
maintenance actions such as removal and replacement of a cylinder
assembly; knowledge required for routine tasks such as removing and
replacing engine thermostats or oil pressure switches.


VII. AGE ELECTRONIC SYSTEM MAINTENANCE RATING - _

Knowledge required to isolate and correct malfunctions in
electrical and electronic circuits and components. It includes the
knowledge required to splice, solder, treat corrosion, adjust, clean,
remove, replace and measure voltage and resistance.


VIII. AGE PICK-UP, DELIVERY AND SERVICE FUNCTIONS RATING - _

Knowledge required to prepare units for use and expediting
delivery to the flightline. Examples: knowledge required to perform
service inspections, service fuel and oil, exercise proper towing and
positioning procedures, operate two-way radios and clean vehicles.

AEROSPACE GROUND EQUIPMENT
AFSC 454X1
SELF-RATING FORMS


AFPT  80-423-205 


GENERAL  BACKGROUND  INFORMATION 

YOUR 

NAME _  SSAN  __ 

Last  First  MI

MONTHS  IN  SERVICE:  _ 

MONTHS  IN  CAREER  FIELD:  _ 

SKILL  LEVEL:  _ 

BASE:  _ 

KNOWLEDGE RATING FORM

Please use the KNOWLEDGE RATING SCALE below to rate yourself on the
amount of KNOWLEDGE you have in the eight general areas of the AGE career
field. The eight areas are listed with definitions of each. Write your
rating of your own knowledge in the space to the right of each area.


KNOWLEDGE  RATING  SCALE 
5 - Very Great Amount of Knowledge
4 - Great Amount of Knowledge
3 - Moderate Amount of Knowledge
2 - Small Amount of Knowledge
1 - None or Almost No Knowledge


I. GENERAL AGE MAINTENANCE RATING - _

Knowledge of common hand tools, special tools, test equipment,
and shop support equipment for the use of isolating and correcting
malfunctions by removing, repairing, and replacing components. This
includes knowledge concerning tasks such as lockwire installation,
corrosion treatment, and minor structural repair.


II. AGE ADMINISTRATIVE FUNCTIONS RATING - _

Knowledge of technical orders systems for the purpose of
locating maintenance information and completing required entries in
maintenance forms. Example: knows how to research and identify
parts using IPBs and then make proper entries in AFTO Forms 244, 350,
or AF Form 2005.


III. AGE GAS TURBINE MAINTENANCE RATING - _

Knowledge required for isolating and correcting malfunctions
within the electrical, pneumatic, fuel, and lubrication systems of
gas turbine compressors. This includes knowledge of procedures
required for removing, replacing, cleaning, and adjusting.


IV. AGE PERIODIC INSPECTIONS RATING - _

Knowledge of scheduled preventative maintenance actions as
outlined in the appropriate technical data. This includes knowledge
of the system on which the periodic inspection is performed.

KNOWLEDGE RATING SCALE


5 - Very Great Amount of Knowledge
4 - Great Amount of Knowledge
3 - Moderate Amount of Knowledge
2 - Small Amount of Knowledge
1 - None or Almost No Knowledge


V. AGE PNEUDRAULIC SYSTEM MAINTENANCE RATING - _

Knowledge required to isolate and correct malfunctions in
AGE pneumatic and hydraulic systems. This includes knowledge of
procedures required for removing, replacing, adjusting, and
performing operational checks.


VI. AGE RECIPROCATING ENGINE MAINTENANCE RATING - _

Knowledge required to isolate and correct malfunctions in
AGE gasoline and diesel engines. Examples: knowledge of
complex maintenance actions such as removal and replacement of a
cylinder assembly; knowledge required for routine tasks such as
removing and replacing engine thermostats or oil pressure
switches.


VII. AGE ELECTRONIC SYSTEM MAINTENANCE RATING - _

Knowledge required to isolate and correct malfunctions in
electrical and electronic circuits and components. It includes
the knowledge required to splice, solder, treat corrosion,
adjust, clean, remove, replace, and measure voltage and
resistance.


VIII. AGE PICK-UP, DELIVERY, AND SERVICE FUNCTIONS RATING - _

Knowledge required to prepare units for use and expediting
delivery to the flightline. Examples: knowledge required to
perform service inspections, service fuel and oil, exercise
proper towing and positioning procedures, operate two-way
radios, and clean vehicles.

EXPERIENCE AND TRAINING RATING FORM


Please use the RATING SCALE below to rate yourself on the amount of
EXPERIENCE and TRAINING you have received in the eight general areas of
the AGE career field. The eight areas are listed with definitions of
each. Write your rating in the space to the right of each area.


RATING  SCALE 

5 - Very Great Amount of Experience and Training
4  -  Great  Amount  of  Experience  and  Training 
3  -  Moderate  Amount  of  Experience  and  Training 
2  -  Small  Amount  of  Experience  and  Training 
1  -  None  or  Almost  No  Experience  and  Training 


I.  GENERAL  AGE  MAINTENANCE  RATING  -  _ 

Use of common hand tools, special tools, test equipment, and
shop support equipment for isolating and correcting malfunctions by
removing, repairing, and replacing components. This includes tasks
such as lockwire installation, corrosion treatment, and minor
structural repair.


II.  AGE  ADMINISTRATIVE  FUNCTIONS  RATING  -  _ 

Use of technical orders systems for the purpose of locating
maintenance information and completing required entries in
maintenance forms. Example: research and identify parts using IPBs
and then make proper entries in AFTO Forms 244, 350, or AF Form 2005.


III. AGE GAS TURBINE MAINTENANCE RATING - _

Isolates and corrects malfunctions within the electrical,
pneumatic, fuel, and lubrication systems of gas turbine compressors.
This includes removing, replacing, cleaning, and adjusting.


IV.  AGE  PERIODIC  INSPECTIONS  RATING  -  _ 

Conducts scheduled preventative maintenance actions as outlined
in the appropriate technical data.

RATING SCALE


5 - Very Great Amount of Experience and Training
4 - Great Amount of Experience and Training
3 - Moderate Amount of Experience and Training
2 - Small Amount of Experience and Training
1 - None or Almost No Experience and Training


V. AGE PNEUDRAULIC SYSTEM MAINTENANCE RATING - _

Isolates and corrects malfunctions in AGE pneumatic and
hydraulic systems. This includes removing, replacing, adjusting, and
performing operational checks.


VI. AGE RECIPROCATING ENGINE MAINTENANCE RATING - _

Isolates and corrects malfunctions in AGE gasoline and diesel
engines. Examples: performs complex maintenance actions such as
removal and replacement of a cylinder assembly; performs routine
tasks such as removing and replacing engine thermostats or oil
pressure switches.


VII. AGE ELECTRONIC SYSTEM MAINTENANCE RATING - _

Isolates and corrects malfunctions in electrical and electronic
circuits and components. Includes splicing, soldering, treating
corrosion, adjusting, cleaning, removing, replacing, and measuring
voltage and resistance.


VIII. AGE PICK-UP, DELIVERY, AND SERVICE FUNCTIONS RATING - _

Prepares units for use and expedites delivery to the
flightline. Examples: performs service inspections, services fuel
and oil, exercises proper towing and positioning procedures, operates
two-way radios, and cleans vehicles.